SERIES PREFACE


In the Essentials of Behavioral Science series, our goal is to provide readers
with books that will deliver key practical information in an efficient,
accessible style. The series features books on a variety of topics, such as
statistics, psychological testing, and research design and methodology, to
name just a few. For the experienced professional, books in the series offer
a concise yet thorough review of a specific area of expertise, including
numerous tips for best practices. Students can turn to series books for a clear
and concise overview of the important topics in which they must become
proficient to practice skillfully, efficiently, and ethically in their chosen
fields.

Wherever  feasible,  visual  cues  highlighting  key  points  are  utilized 
alongside  systematic,  step-by-step  guidelines.  Chapters  are  focused  and 
succinct.  Topics  are  organized  for  an  easy  understanding  of  the  essential 
material  related  to  a  particular  topic.  Theory  and  research  are  continually 
woven  into  the  fabric  of  each  book,  but  always  to  enhance  the  practical 
application  of  the  material,  rather  than  to  sidetrack  or  overwhelm  readers. 
With  this  series,  we  aim  to  challenge  and  assist  readers  in  the  behavioral 
sciences  to  aspire  to  the  highest  level  of  competency  by  arming  them  with 
the  tools  they  need  for  knowledgeable,  informed  practice. 

The purposes of Essentials of Research Design and Methodology are to discuss
the various types of research designs that are commonly used, the basic
process by which research studies are conducted, the research-related
considerations of which researchers should be aware, the manner in which
the results of research can be interpreted and disseminated, and the typical
pitfalls faced by researchers when designing and conducting a research
study. This book is ideal for those readers with minimal knowledge of
research as well as for those readers with intermediate knowledge who need
a quick refresher regarding particular aspects of research design and
methodology. For those readers with an advanced knowledge of research
design and methodology, this book can be used as a concise summary of
basic research techniques and principles, or as an adjunct to a more
advanced research methodology and design textbook. Finally, even for those
readers who do not conduct research, this book will become a valuable
addition to your bookcase because it will assist you in becoming a more
educated consumer of research. Being able to evaluate the appropriateness
of a research design or the conclusions drawn from a particular research
study will become increasingly important as research becomes more
accessible to nonscientists. In that regard, this book will improve your
ability to efficiently and effectively digest and understand the results of
a research study.

Alan S. Kaufman, PhD, and Nadeen L. Kaufman, EdD, Founding Editors

Yale  University  School  of  Medicine 
One


INTRODUCTION  AND  OVERVIEW 


Progress in almost every field of science depends on the contributions
made by systematic research; thus research is often viewed as
the cornerstone of scientific progress. Broadly defined, the purpose
of research is to answer questions and acquire new knowledge. Research
is the primary tool used in virtually all areas of science to expand the
frontiers of knowledge. For example, research is used in such diverse scientific
fields as psychology, biology, medicine, physics, and botany, to name just
a few of the areas in which research makes valuable contributions to what
we know and how we think about things. Among other things, by conducting
research, researchers attempt to reduce the complexity of problems,
discover the relationship between seemingly unrelated events, and
ultimately improve the way we live.

Although research studies are conducted in many diverse fields of science,
the general goals and defining characteristics of research are typically
the same across disciplines. For example, across all types of science,
research is frequently used for describing a thing or event, discovering the
relationship between phenomena, or making predictions about future
events. In short, research can be used for the purposes of description,
explanation, and prediction, all of which make important and valuable
contributions to the expansion of what we know and how we live our lives. In
addition to sharing similar broad goals, scientific research in virtually all
fields of study shares certain defining characteristics, including testing
hypotheses, careful observation and measurement, systematic evaluation
of data, and drawing valid conclusions.
ESSENTIALS OF RESEARCH DESIGN AND METHODOLOGY


In recent years, the results of various research studies have taken center
stage in the popular media. No longer is research the private domain of
research professors and scientists wearing white lab coats. To the contrary,
the results of research studies are frequently reported on the local evening
news, CNN, the Internet, and various other media outlets that are
accessible to scientists and nonscientists alike. For example, in recent
years, we have all become familiar with research regarding the effects of
stress on our psychological well-being, the health benefits of a
low-cholesterol diet, the effects of exercise in preventing certain forms of
cancer, which automobiles are safest to drive, and the deleterious effects of
pollution on global warming. We may have even become familiar with
research studies regarding the human genome, the Mars Land Rover, the use
of stem cells, and genetic cloning. Not too long ago, it was unlikely that the
results of such highly scientific research studies would have been shared
with the general public to such a great extent.

Despite the accessibility and prevalence of research in today’s society,
many people share common misperceptions about exactly what research
is, how research can be used, what research can tell us, and the limitations
of research. For some people, the term “research” conjures up images of
scientists in laboratories watching rats run through mazes or mixing
chemicals in test tubes. For other people, the term “research” is associated
with telemarketer surveys, or people approaching them at the local
shopping mall to “just ask you a few questions about your shopping habits.” In
actuality, these stereotypical examples of research are only a small part of
what research comprises. It is therefore not surprising that many people
are unfamiliar with the various types of research designs, the basics of how
research is conducted, what research can be used for, and the limits of
using research to answer questions and acquire new knowledge. Rapid
Reference 1.1 discusses what we mean by “research” from a scientific
perspective.

Before  addressing  these  important  issues,  however,  we  should  first 
briefly  review  what  science  is  and  how  it  goes  about  telling  us  what  we 
know. 
Rapid Reference 1.1

What Exactly is Research?

Research studies come in many different forms, and we will discuss
several of these forms in more detail in Chapter 5. For now, however, we will
focus on two of the most common types of research: correlational
research and experimental research.

Correlational research: In correlational research, the goal is to
determine whether two or more variables are related. (By the way, “variables” is
a term with which you should be familiar. A variable is anything that can
take on different values, such as weight, time, and height.) For example, a
researcher may be interested in determining whether age is related to
weight. In this example, a researcher may discover that age is indeed
related to weight because as age increases, weight also increases. If a
correlation between two variables is strong enough, knowing about one
variable allows a researcher to make a prediction about the other variable.
There are several different types of correlations, which will be discussed in
more detail in Chapter 5. It is important to point out, however, that a
correlation (or relationship) between two things does not necessarily
mean that one thing caused the other. To draw a cause-and-effect
conclusion, researchers must use experimental research. This point will be
emphasized throughout this book.
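
As a rough illustration of the age and weight example, the sketch below computes a Pearson correlation coefficient, one common way of quantifying how strongly two variables are related. The function name pearson_r and the ages and weights are hypothetical and were invented for this illustration; they are not data from any study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical ages (years) and weights (kg) for six children.
ages = [2, 4, 6, 8, 10, 12]
weights = [12, 16, 21, 26, 32, 40]

r = pearson_r(ages, weights)
print(f"r = {r:.3f}")  # a value near +1 indicates a strong positive relationship
```

A coefficient near +1 or -1 suggests that one variable can be predicted fairly well from the other, while a value near 0 suggests little linear relationship. Even a very high coefficient says nothing about whether one variable causes the other.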

Experimental research: In its simplest form, experimental research
involves comparing two groups on one outcome measure to test some
hypothesis regarding causation. For example, if a researcher is interested in
the effects of a new medication on headaches, the researcher would
randomly divide a group of people with headaches into two groups. One of
the groups, the experimental group, would receive the new medication
being tested. The other group, the control group, would receive a placebo
medication (i.e., a medication containing a harmless substance, such as
sugar, that has no physiological effects). Besides receiving the different
medications, the groups would be treated exactly the same so that the
research could isolate the effects of the medications. After receiving the
medications, both groups would be compared to see whether people in
the experimental group had fewer headaches than people in the control
group. Assuming this study was properly designed (and properly designed
studies will be discussed in detail in later chapters), if people in the
experimental group had fewer headaches than people in the control group, the
researcher could conclude that the new medication reduces headaches.
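
The random division of participants into an experimental group and a control group can be sketched in a few lines. This is only a minimal illustration under simple assumptions: the function name randomly_assign, the participant IDs, and the seed value are all hypothetical, and real studies often use more elaborate randomization procedures.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs for 20 people with headaches.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental_group, control_group = randomly_assign(participants, seed=42)

print("Experimental group:", sorted(experimental_group))
print("Control group:", sorted(control_group))
```

Because chance alone decides who lands in which group, preexisting differences among participants tend to balance out across the two groups, which is what allows a later difference in headaches to be attributed to the medication.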


OVERVIEW  OF  SCIENCE  AND  THE  SCIENTIFIC  METHOD 

In simple terms, science can be defined as a methodological and systematic
approach to the acquisition of new knowledge. This definition of science
highlights some of the key differences between how scientists and
nonscientists go about acquiring new knowledge. Specifically, rather than
relying on mere casual observations and an informal approach to learn
about the world, scientists attempt to gain new knowledge by making
careful observations and using systematic, controlled, and methodical
approaches (Shaughnessy & Zechmeister, 1997). By doing so, scientists are
able to draw valid and reliable conclusions about what they are studying.
In addition, scientific knowledge is not based on the opinions, feelings, or
intuition of the scientist. Instead, scientific knowledge is based on
objective data that were reliably obtained in the context of a carefully designed
research study. In short, scientific knowledge is based on the accumulation
of empirical evidence (Kazdin, 2003a), which will be the topic of a great
deal of discussion in later chapters of this book.

The defining characteristic of scientific research is the scientific
method (summarized in Rapid Reference 1.2). First described by the
English philosopher and scientist Roger Bacon in the 13th century, the
scientific method is still generally regarded as the basis for all scientific
investigation. The scientific method is best thought of as an approach to the
acquisition of new knowledge, and this approach effectively distinguishes
science from nonscience. To be clear, the scientific method is not actually
a single method, as the name would erroneously lead one to believe, but
rather an overarching perspective on how scientific investigations should
proceed. It is a set of research principles and methods that helps
researchers obtain valid results from their research studies. Because the
scientific method deals with the general approach to research rather than the
content of specific research studies, it is used by researchers in all different
scientific disciplines. As will be seen in the following sections, the biggest
benefit of the scientific method is that it provides a set of clear and
agreed-upon guidelines for gathering, evaluating, and reporting information in
the context of a research study (Cozby, 1993).
Rapid Reference 1.2

The Scientific Method

The development of the scientific method is usually credited to Roger
Bacon, a philosopher and scientist from 13th-century England, although
some argue that the Italian scientist Galileo Galilei played an important
role in formulating the scientific method. Later contributions to the
scientific method were made by the philosophers Francis Bacon and René
Descartes. Although some disagreement exists regarding the exact
characteristics of the scientific method, most agree that it is characterized by
the following elements:

•  Empirical  approach 

•  Observations 

•  Questions 

•  Hypotheses 

•  Experiments 

•  Analyses 

•  Conclusions 

•  Replication 


There has been some disagreement among researchers over the years
regarding the elements that compose the scientific method. In fact, some
researchers have even argued that it is impossible to define a universal
approach to scientific investigation. Nevertheless, for over 100 years, the
scientific method has been the defining feature of scientific research.
Researchers generally agree that the scientific method is composed of the
following key elements (which will be the focus of the remainder of this
chapter): an empirical approach, observations, questions, hypotheses,
experiments, analyses, conclusions, and replication.

Before proceeding any further, one word of caution is necessary. In the
brief discussion of the scientific method that follows, we will be
introducing several new terms and concepts that are related to research design and
methodology. Do not be intimidated if you are unfamiliar with some of the
content contained in this discussion. The purpose of the following is simply
to set the stage for the chapters that follow, and we will be elaborating on
each of the terms and concepts throughout the remainder of the book.

Empirical  Approach 

The scientific method is firmly based on the empirical approach. The
empirical approach is an evidence-based approach that relies on direct
observation and experimentation in the acquisition of new knowledge (see
Kazdin, 2003a). In the empirical approach, scientific decisions are made
based on the data derived from direct observation and experimentation.
Contrast this approach to decision making with the way that most
nonscientific decisions are made in our daily lives. For example, we have all made
decisions based on feelings, hunches, or “gut” instinct. Additionally, we
may often reach conclusions or make decisions that are not necessarily
based on data, but rather on opinions, speculation, and a hope for the best.
The empirical approach, with its emphasis on direct, systematic, and
careful observation, is best thought of as the guiding principle behind all
research conducted in accordance with the scientific method.

Observations 

An important component in any scientific investigation is observation. In
this sense, observation refers to two distinct concepts: being aware of the
world around us and making careful measurements. Observations of the
world around us often give rise to the questions that are addressed through
scientific research. For example, the Newtonian observation that apples
fall from trees stimulated much research into the effects of gravity.
Therefore, a keen eye to your surroundings can often provide you with many
ideas for research studies. We will discuss the generation of research ideas
in more detail in Chapter 2.

In the context of science, observation means more than just observing
the world around us to get ideas for research. Observation also refers to the
process of making careful and accurate measurements, which is a
distinguishing feature of well-conducted scientific investigations. When making
measurements in the context of research, scientists typically take great
precautions to avoid making biased observations. For example, if a
researcher is observing the amount of time that passes between two events,
such as the length of time that elapses between lightning and thunder, it
would certainly be advisable for the researcher to use a measurement
device that has a high degree of accuracy and reliability. Rather than simply
trying to “guesstimate” the amount of time that elapsed between those
two events, the researcher would be advised to use a stopwatch or similar
measurement device. By doing so, the researcher ensures that the
measurement is accurate and not biased by extraneous factors. Most people
would likely agree that the observations that we make in our daily lives are
rarely made so carefully or systematically.

An important aspect of measurement is an operational definition.
Researchers define key concepts and terms in the context of their research
studies by using operational definitions. By using operational definitions,
researchers ensure that everyone is talking about the same phenomenon.
For example, if a researcher wants to study the effects of exercise on stress
levels, it would be necessary for the researcher to define what “exercise”
is. Does exercise refer to jogging, weight lifting, swimming, jumping rope,
or all of the above? By defining “exercise” for the purposes of the study,
the researcher makes sure that everyone is referring to the same thing.
Clearly, the definition of “exercise” can differ from one study to another,
so it is crucial that the researcher define “exercise” in a precise manner in
the context of his or her study. Having a clear definition of terms also
ensures that the researcher’s study can be replicated by other researchers.
The importance of operational definitions will be discussed further in
Chapter 2.

Questions 

After getting a research idea, perhaps from making observations of the
world around us, the next step in the research process involves translating
that research idea into an answerable question. The term “answerable” is
particularly important in this respect, and it should not be overlooked. It
would obviously be a frustrating and ultimately unrewarding endeavor to
attempt to answer an unanswerable research question through scientific
investigation. An example of an unanswerable research question is the
following: “Is there an exact replica of me in another universe?” Although
this is certainly an intriguing question that would likely yield important
information, the current state of science cannot provide an answer to that
question. It is therefore important to formulate a research question that
can be answered through available scientific methods and procedures.
One might ask, for example, whether exercising (perhaps operationally
defined as running three times per week for 30 minutes each time)
reduces cholesterol levels. This question could be researched and
answered using established scientific methods.

Hypotheses 

The next step in the scientific method is coming up with a hypothesis, which
is simply an educated (and testable) guess about the answer to your
research question. A hypothesis is often described as an attempt by the
researcher to explain the phenomenon of interest. Hypotheses can take
various forms, depending on the question being asked and the type of study
being conducted (see Rapid Reference 1.3).

A key feature of all hypotheses is that each must make a prediction.
Remember that hypotheses are the researcher’s attempt to explain the
phenomenon being studied, and that explanation should involve a prediction
about the variables being studied. These predictions are then tested by
gathering and analyzing data, and the hypotheses can either be supported
or refuted (falsified; see Rapid Reference 1.4) on the basis of the data.

In their simplest forms, hypotheses are typically phrased as “if-then”
statements. For example, a researcher may hypothesize that “if people
exercise for 30 minutes per day at least three days per week, then their
cholesterol levels will be reduced.” This hypothesis makes a prediction about
the effects of exercising on levels of cholesterol, and the prediction can be
tested by gathering and analyzing data.

Two types of hypotheses with which you should be familiar are the null
hypothesis and the alternate (or experimental) hypothesis. The null
hypothesis always predicts that there will be no differences between the groups
being studied. By contrast, the alternate hypothesis predicts that there will be a
difference between the groups. In our example, the null hypothesis would
predict that the exercise group and the no-exercise group will not differ
significantly on levels of cholesterol. The alternate hypothesis would
predict that the two groups will differ significantly on cholesterol levels.
Hypotheses will be discussed in more detail in Chapter 2.


Rapid Reference 1.3

Relationship Between Hypotheses and Research Design

Hypotheses can take many different forms depending on the type of
research design being used. Some hypotheses may simply describe how two
things may be related. For example, in correlational research (which will
be discussed in Chapter 5), a researcher might hypothesize that alcohol
intoxication is related to poor decision making. In other words, the
researcher is hypothesizing that there is a relationship between using
alcohol and decision making ability (but not necessarily a causal relationship).
However, in a study using a randomized controlled design (which will also
be discussed in Chapter 5), the researcher might hypothesize that using
alcohol causes poor decision making. Therefore, as may be evident, the
hypothesis being tested by a researcher is largely dependent on the type
of research design being used. The relationship between hypotheses and
research design will be discussed in more detail in later chapters.


Rapid Reference 1.4

Falsifiability of Hypotheses

According to the 20th-century philosopher Karl Popper, hypotheses must
be falsifiable (Popper, 1963). In other words, the researcher must be able
to demonstrate that the hypothesis is wrong. If a hypothesis is not
falsifiable, then science cannot be used to test the hypothesis. For example,
hypotheses based on religious beliefs are not falsifiable. Therefore, because
we can never prove that faith-based hypotheses are wrong, there would
be no point in conducting research to test them. Another way of saying
this is that the researcher must be able to reject the proposed
explanation (i.e., hypothesis) of the phenomenon being studied.


Experiments 

After articulating the hypothesis, the next step involves actually
conducting the experiment (or research study). For example, if the study involves
investigating the effects of exercise on levels of cholesterol, the researcher
would design and conduct a study that would attempt to address that
question. As previously mentioned, a key aspect of conducting a research study
is measuring the phenomenon of interest in an accurate and reliable manner
(see Rapid Reference 1.5). In this example, the researcher would collect
data on the cholesterol levels of the study participants by using an accurate
and reliable measurement device. Then, the researcher would compare the
cholesterol levels of the two groups to see if exercise had any effects.


Rapid Reference 1.5


Accuracy  vs.  Reliability 

When talking about measurement in the context of research, there is an
important distinction between being accurate and being reliable. Accuracy
refers to whether the measurement is correct, whereas reliability refers to
whether the measurement is consistent. An example may help to clarify
the distinction. When throwing darts at a dart board, “accuracy” refers to
whether the darts are hitting the bull’s eye (an accurate dart thrower will
throw darts that hit the bull’s eye). “Reliability,” on the other hand, refers
to whether the darts are hitting the same spot (a reliable dart thrower will
throw darts that hit the same spot). Therefore, an accurate and reliable
dart thrower will consistently throw the darts in the bull’s eye. As may be
evident, however, it is possible for the dart thrower to be reliable, but not
accurate. For example, the dart thrower may throw all of the darts in the
same spot (which demonstrates high reliability), but that spot may not be
the bull’s eye (which demonstrates low accuracy). In the context of
measurement, both accuracy and reliability are equally important.
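
The dart board example can be put into numbers with a small sketch. The two measures used here are assumptions chosen for illustration rather than standard definitions from this book: accuracy_error is the distance from the average landing point to the bull's eye, and spread is the average distance of the throws from their own center. The coordinates are invented.

```python
from math import hypot
from statistics import mean

def accuracy_error(throws, target=(0.0, 0.0)):
    """Distance from the average landing point to the target (lower = more accurate)."""
    center_x = mean(x for x, _ in throws)
    center_y = mean(y for _, y in throws)
    return hypot(center_x - target[0], center_y - target[1])

def spread(throws):
    """Average distance of throws from their own center (lower = more reliable)."""
    center_x = mean(x for x, _ in throws)
    center_y = mean(y for _, y in throws)
    return mean(hypot(x - center_x, y - center_y) for x, y in throws)

# Hypothetical dart positions in centimeters; the bull's eye is at (0, 0).
reliable_not_accurate = [(5.0, 5.1), (5.1, 4.9), (4.9, 5.0), (5.0, 5.0)]
accurate_and_reliable = [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1), (0.0, 0.0)]

for name, throws in [("reliable, not accurate", reliable_not_accurate),
                     ("accurate and reliable", accurate_and_reliable)]:
    print(f"{name}: error = {accuracy_error(throws):.2f} cm, "
          f"spread = {spread(throws):.2f} cm")
```

The first thrower has a small spread but a large error, which corresponds to high reliability with low accuracy. The second thrower is low on both measures, which corresponds to a measurement that is both accurate and reliable.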
Analyses

After conducting the study and gathering the data, the next step involves
analyzing the data, which generally calls for the use of statistical
techniques. The type of statistical techniques used by a researcher depends on
the design of the study, the type of data being gathered, and the questions
being asked. Although a detailed discussion of statistics is beyond the
scope of this text, it is important to be aware of the role of statistics in
conducting a research study. In short, statistics help researchers minimize the
likelihood of reaching an erroneous conclusion about the relationship
between the variables being studied.

A key decision that researchers must make with the assistance of
statistics is whether the null hypothesis should be rejected. Remember that the
null hypothesis always predicts that there will be no difference between the
groups. Therefore, rejecting the null hypothesis means that there is a
difference between the groups. In general, most researchers seek to reject the
null hypothesis because rejection means the phenomenon being studied
(e.g., exercise, medication) had some effect.

It is important to note that there are only two choices with respect to
the null hypothesis. Specifically, the null hypothesis can be either rejected
or not rejected, but it can never be accepted. If we reject the null
hypothesis, we are concluding that there is a significant difference between the
groups. If, however, we do not reject the null hypothesis, then we are
concluding that we were unable to detect a difference between the groups. To
be clear, it does not mean that there is no difference between the two
groups. There may in actuality have been a significant difference between
the two groups, but we were unable to detect that difference in our study.
We will talk more about this important distinction in later chapters.
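
The reject or do-not-reject decision can be illustrated with a short sketch. An exact permutation test is used here only because it needs nothing beyond counting; it is an assumption of this illustration, not a technique prescribed by the text. The function names, the cholesterol values, and the 0.05 cutoff passed as alpha are likewise hypothetical choices.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in group means.

    Returns the proportion of all possible regroupings of the pooled scores
    whose mean difference is at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    regroupings = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:  # small tolerance for rounding
            extreme += 1
        regroupings += 1
    return extreme / regroupings

def decide(p_value, alpha=0.05):
    """The null hypothesis is either rejected or not rejected, never accepted."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

# Hypothetical cholesterol levels (mg/dL) for the two groups.
exercise = [182, 175, 190, 168, 179, 185]
no_exercise = [201, 195, 210, 188, 205, 198]

p = permutation_p_value(exercise, no_exercise)
print(f"p = {p:.4f}: {decide(p)}")
```

Note that decide has only two outcomes. A result of "do not reject" means only that the study did not detect a difference, not that the groups are known to be the same.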

The decision of whether to reject the null hypothesis is based on the
results of statistical analyses, and there are two types of errors that
researchers must be careful to avoid when making this decision: Type I
errors and Type II errors. A Type I error occurs when a researcher concludes
that there is a difference between the groups being studied when, in fact,
there is no difference. This is sometimes referred to as a “false positive.”
By contrast, a Type II error occurs when the researcher concludes that there
is not a difference between the two groups being studied when, in fact,
there is a difference. This is sometimes referred to as a “false negative.” As
previously noted, the conclusion regarding whether there is a difference
between the groups is based on the results of statistical analyses.
Specifically, with a Type I error, although there is a statistically significant result,
it occurred by chance (or error) and there is not actually a difference
between the two groups (Wampold, Davis, & Good, 2003). With a Type II
error, there is a nonsignificant statistical result when, in fact, there actually
is a difference between the two groups (Wampold et al.).

The  typical  convention  in  most  fields  of  science  allows  for  a  5%  chance 
of  erroneously  rejecting  the  null  hypothesis  (i.e.,  of  making  a  Type  I  error). 
In other words, a researcher will conclude that there is a significant
difference between the groups being studied (i.e., will reject the null
hypothesis) only if a difference at least as large as the one observed would
be expected less than 5% of the time when no true difference exists. For
obvious reasons, researchers want to reduce the likelihood of concluding that
there is a significant difference between the groups being studied when, in
fact, there is not a difference.
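
To see what the 5% convention means in practice, consider the following
sketch in Python. It is an illustration under stated assumptions rather than
a recommended analysis: both groups are simulated from the same population
(so the null hypothesis is true by construction), the population standard
deviation is treated as known, and the function names, group sizes, number
of simulated studies, and seed are our own hypothetical choices.

```python
import random
from statistics import NormalDist, mean


def two_sample_z_p_value(group1, group2, sigma):
    """Two-sided p-value for a difference in means, assuming a known common SD."""
    n1, n2 = len(group1), len(group2)
    standard_error = sigma * (1 / n1 + 1 / n2) ** 0.5
    z = (mean(group1) - mean(group2)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha=0.05, n_per_group=30, n_studies=2000, seed=1):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        # Both groups come from the same population: any "difference" is chance.
        group1 = [rng.gauss(50, 10) for _ in range(n_per_group)]
        group2 = [rng.gauss(50, 10) for _ in range(n_per_group)]
        if two_sample_z_p_value(group1, group2, sigma=10) < alpha:
            rejections += 1  # a Type I error
    return rejections / n_studies


rate = false_positive_rate()
print(f"Share of studies that wrongly rejected the null: {rate:.3f}")
```

Under these assumptions, the share of simulated studies that commit a Type I
error should land close to the chosen 5% level, which is what the convention
is meant to guarantee in the long run.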

The  distinction  between  Type  I  and  Type  II  errors  is  very  important, 
although  somewhat  complicated.  An  example  may  help  to  clarify  these 
terms.  In  our  example,  a  researcher  conducts  a  study  to  determine  whether 
a  new  medication  is  effective  in  treating  depression.  The  new  medication 
is given to Group 1, while a placebo medication is given to Group 2. If, at
the conclusion of the study, the researcher concludes that there is a
significant difference in levels of depression between Groups 1 and 2 when,
in fact, there is no difference, the researcher has made a Type I error. In
simpler terms, the researcher has detected a difference between the groups
that in actuality does not exist; the difference between the groups occurred
by chance (or error). By contrast, if the researcher concludes that there is
no significant difference in levels of depression between Groups 1 and 2
when, in fact, there is a difference, the researcher has made a Type II
error. In simpler terms, the researcher has failed to detect a difference
that actually exists between the groups.

Which  type  of  error  is  more  serious — Type  I  or  Type  II?  The  answer  to 
this question often depends on the context in which the errors are made.
Let’s use the medical context as an example. If a doctor diagnoses a patient
with cancer when, in fact, the patient does not have cancer (i.e., a false
positive), the doctor has committed a Type I error. In this situation, it is
likely that the erroneous diagnosis will be discovered (perhaps through a
second opinion) and the patient will undoubtedly be relieved. If, however,
the doctor gives the patient a clean bill of health when, in fact, the
patient actually has cancer (i.e., a false negative), the doctor has
committed a Type II error. Most people would likely agree that a Type II
error would be more serious in this example because it would prevent the
patient from getting necessary medical treatment.

Type I Errors vs. Type II Errors

Type I Error (false positive): Concluding there is a difference between the
groups being studied when, in fact, there is no difference.

Type II Error (false negative): Concluding there is no difference between
the groups being studied when, in fact, there is a difference.

Type I and Type II errors can be illustrated using the following table:

                             Actual Results
Researcher’s Conclusion      Difference          No Difference
Difference                   Correct decision    Type I error
No difference                Type II error       Correct decision

You may be wondering why researchers do not simply set up their research
studies so that there is even less chance of making a Type I error. For
example, wouldn’t it make sense for researchers to set up their research
studies so that the chance of making a Type I error is less than 1% or,
better yet, 0%? The reason that researchers do not set up their studies in
this manner has to do with the relationship between making Type I errors and
making Type II errors. Specifically, there is an inverse relationship
between Type I errors and Type II errors, which means that by decreasing
the probability of making a Type I error, the researcher is increasing the
probability of making a Type II error. In other words, if a researcher
reduces the probability of making a Type I error from 5% to 1%, there is
now an increased probability that the researcher will make a Type II error
by failing to detect a difference that actually exists. The 5% level is a
standard convention in most fields of research and represents a compromise
between making Type I and Type II errors.
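
The trade-off just described can be made concrete with a short calculation.
The sketch below is only an illustration: it assumes a two-sided,
two-sample z-test with a known standard deviation, and the effect size
(half a standard deviation), the group size (30 per group), and the function
name are hypothetical values chosen for the example.

```python
from statistics import NormalDist


def type_ii_error_rate(alpha, effect_size, n_per_group):
    """Probability of missing a true difference with a two-sided z-test.

    effect_size is the true difference in means divided by the common SD.
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the difference is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    power = (1 - std_normal.cdf(critical_z - shift)) + std_normal.cdf(-critical_z - shift)
    return 1 - power


for alpha in (0.05, 0.01):
    beta = type_ii_error_rate(alpha, effect_size=0.5, n_per_group=30)
    print(f"Type I error level {alpha:.2f} -> Type II error probability {beta:.2f}")
```

With these assumed values, tightening the Type I error level from 5% to 1%
raises the probability of a Type II error from roughly 0.51 to roughly 0.74.
The sample size and the true effect are held fixed here; that is the
situation in which the inverse relationship applies.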

Conclusions 

After analyzing the data and determining whether to reject the null
hypothesis, the researcher is now in a position to draw some conclusions
about the results of the study. For example, if the researcher rejected the
null hypothesis, the researcher can conclude that the phenomenon being
studied had an effect (a statistically significant effect, to be more
precise). If the researcher rejects the null hypothesis in our
exercise-cholesterol example, the researcher is concluding that exercise had
an effect on levels of cholesterol.

It  is  important  that  researchers  make  only  those  conclusions  that  can  be 
supported  by  the  data  analyses.  Going  beyond  the  data  is  a  cardinal  sin  that 
researchers must be careful to avoid. For example, if a researcher
conducted a correlational study and the results indicated that the two things
being studied were strongly related, the researcher could not conclude that
one thing caused the other. An oft-repeated statement that will be
explained in later chapters is that correlation (i.e., a relationship between two
things)  does  not  equal  causation.  In  other  words,  the  fact  that  two  things 
are  related  does  not  mean  that  one  caused  the  other. 

DON'T FORGET

Correlation Does Not Equal Causation

Before looking at an example of why correlation does not equal causation,
let’s make sure that we understand what a correlation is. A correlation is
simply a relationship between two things. For example, size and weight are
often correlated because there is a relationship between the size of
something and its weight. Specifically, bigger things tend to weigh more.
The results of correlational studies simply provide researchers with
information regarding the relationship between two or more variables, which
may serve as the basis for future studies. It is important, however, that
researchers interpret this relationship cautiously.

For example, if a researcher finds that eating ice cream is correlated with
(i.e., related to) higher rates of drowning, the researcher cannot conclude
that eating ice cream causes drowning. It may be that another variable is
responsible for the higher rates of drowning. For example, most ice cream
is eaten in the summer and most swimming occurs in the summer. Therefore,
the higher rates of drowning are not caused by eating ice cream, but rather
by the increased number of people who swim during the summer.

Replication

One of the most important elements of the scientific method is replication.
Replication essentially means conducting the same research study a second
time with another group of participants to see whether the same results are
obtained (see Kazdin, 1992; Shaughnessy & Zechmeister, 1997). The same
researcher may attempt to replicate previously obtained results, or perhaps
other researchers may undertake that task. Replication illustrates an
important point about scientific research, namely that researchers should
avoid drawing broad conclusions based on the results of a single research
study because it is always possible that the results of that particular
study were an aberration. In other words, it is possible that the results of
the research study were obtained by chance or error and, therefore, that the
results may not accurately represent the actual state of things. However, if
the results of a research study are obtained a second time (i.e.,
replicated), the likelihood that the original study’s findings were obtained
by chance or error is greatly reduced.
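
A rough calculation shows why replication is so persuasive. The sketch
below rests on simplifying assumptions of our own: the studies are
independent, each uses the same 5% level, and there is truly no difference
between the groups. The function name is hypothetical.

```python
def chance_all_false_positives(alpha, n_studies):
    """Probability that every one of n independent studies rejects a true null."""
    return alpha ** n_studies


for n_studies in (1, 2, 3):
    probability = chance_all_false_positives(0.05, n_studies)
    print(f"Studies: {n_studies}  chance that all are false positives: {probability:.6f}")
```

Under these assumptions a single false positive has a 5% chance, whereas two
false positives in a row have a chance of only 0.25%. The calculation does
not give the probability that a finding is true; it only shows how quickly
the "chance or error" explanation loses plausibility as results are
repeated.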

The importance of replication in research cannot be overstated.
Replication serves several integral purposes, including establishing the
reliability (i.e., consistency) of the research study’s findings and
determining whether the same results can be obtained with a different group
of participants. This last point refers to whether the results of the
original study are generalizable to other groups of research participants. If
the results of a study are replicated, the researchers (and the field in
which the researchers work) can have greater confidence in the reliability
and generalizability of the original findings.

GOALS  OF  SCIENTIFIC  RESEARCH 

As  stated  previously,  the  goals  of  scientific  research,  in  broad  terms,  are  to 
answer questions and acquire new knowledge. This is typically
accomplished by conducting research that permits drawing valid inferences
about the relationship between two or more variables (Kazdin, 1992). In
later chapters, we discuss the specific techniques that researchers use to
ensure that valid inferences can be drawn from their research, and in Rapid
References 1.6 and 1.7 we present some research-related terms you should
become  familiar  with.  For  now,  however,  our  main  discussion  will  focus 
on  the  goals  of  scientific  research  in  more  general  terms.  Most  researchers 
agree  that  the  three  general  goals  of  scientific  research  are  description, 
prediction,  and  understanding/explanation  (Cozby,  1993;  Shaughnessy  & 
Zechmeister,  1997). 

Description 

Perhaps  the  most  basic  and  easily  understood  goal  of  scientific  research  is 
description. In short, description refers to the process of defining,
classifying, or categorizing phenomena of interest. For example, a researcher
may wish to conduct a research study that has the goal of describing the
relationship between two things or events, such as the relationship between
cardiovascular exercise and levels of cholesterol. Alternatively, a
researcher may be interested in describing a single phenomenon, such as the
effects of stress on decision making.

Categories of Research

There are two broad categories of research with which researchers must be
familiar.

Quantitative vs. Qualitative

• Quantitative research involves studies that make use of statistical
analyses to obtain their findings. Key features include formal and
systematic measurement and the use of statistics.

• Qualitative research involves studies that do not attempt to quantify
their results through statistical summary or analysis. Qualitative studies
typically involve interviews and observations without formal measurement.
A case study, which is an in-depth examination of one person, is a form of
qualitative research. Qualitative research is often used as a source of
hypotheses for later testing in quantitative research.

Nomothetic vs. Idiographic

• The nomothetic approach uses the study of groups to identify general
laws that apply to a large group of people. The goal is often to identify
the average member of the group being studied or the average performance
of a group member.

• The idiographic approach is the study of an individual. An example of
the idiographic approach is the aforementioned case study.

The choice of which research approaches to use largely depends on the
types of questions being asked in the research study, and different fields of
research typically rely on different categories of research to achieve their
goals. Social science research, for example, typically relies on quantitative
research and the nomothetic approach. In other words, social scientists
study large groups of people and rely on statistical analyses to obtain their
findings. These two broad categories of research will be the primary focus
of this book.

Sample vs. Population

Two key terms that you must be familiar with are “sample” and “population.”
The population is all individuals of interest to the researcher. For
example, a researcher may be interested in studying anxiety among lawyers;
in this example, the population is all lawyers. For obvious reasons,
researchers are typically unable to study the entire population. In this
case it would be difficult, if not impossible, to study anxiety among all
lawyers. Therefore, researchers typically study a subset of the population,
and that subset is called a sample.

Because researchers may not be able to study the entire population of
interest, it is important that the sample be representative of the
population from which it was selected. For example, the sample of lawyers
the researcher studies should be similar to the population of lawyers. If
the population of lawyers is composed mainly of White men over the age of
35, studying a sample of lawyers composed mainly of Black women under the
age of 30 would obviously be problematic because the sample is not
representative of the population. Studying a representative sample permits
the researcher to draw valid inferences about the population. In other
words, when a researcher uses a representative sample, if something is
true of the sample, it is likely also true of the population.

Descriptive research is useful because it can provide important information
regarding the average member of a group. Specifically, by gathering data on
a large enough group of people, a researcher can describe the average
member, or the average performance of a member, of the particular group
being studied. Perhaps a brief example will help clarify what we mean by
this. Let’s say a researcher gathers Scholastic Aptitude Test (SAT) scores
from the current freshman class at a prestigious university. By using some
simple statistical techniques, the researcher would be able to calculate the
average SAT score for the current freshmen at the university. This
information would likely be informative for high school students who are
considering applying for admittance at the university.
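
The "simple statistical techniques" in the SAT example can be as modest as
an average and a measure of spread. The sketch below uses a handful of
made-up scores purely for illustration; the variable names and values are
hypothetical and are not data from any real university.

```python
from statistics import mean, stdev

# Hypothetical SAT scores for a small sample of freshmen (illustrative only).
sat_scores = [1380, 1450, 1290, 1510, 1420, 1360, 1480, 1400]

average_score = mean(sat_scores)   # describes the "average member" of the group
score_spread = stdev(sat_scores)   # how much individual scores vary around it

print(f"Average SAT score: {average_score:.2f}")
print(f"Standard deviation: {score_spread:.2f}")
```

For these eight made-up scores the average is 1411.25. A real study would of
course rely on a much larger and representative sample.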

One example of descriptive research is correlational research. In
correlational research (as mentioned earlier), the researcher attempts to
determine whether there is a relationship (that is, a correlation) between
two or more variables (see Rapid Reference 1.8 for two types of
correlation). For example, a researcher may wish to determine whether there
is a relationship between SAT scores and grade-point averages (GPAs) among a
sample  of  college  freshmen.  The  many  uses  of  correlational  research  will 
be  discussed  in  later  chapters. 
Two Types of Correlation

Positive correlation: A positive correlation between two variables means
that both variables change in the same direction (either both increase or
both decrease). For example, if GPAs increase as SAT scores increase, there
is a positive correlation between SAT scores and GPAs.

Negative (inverse) correlation: A negative correlation between two
variables means that as one variable increases, the other variable
decreases. In other words, the variables change in opposite directions. So,
if GPAs decrease as SAT scores increase, there is a negative correlation
between SAT scores and GPAs.
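
The two directions of correlation can be illustrated with a small
calculation. The sketch below computes the Pearson correlation coefficient,
one common measure of correlation (chosen here by us; the text does not
name a particular statistic), on made-up SAT and GPA values. All names and
numbers are hypothetical.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / (spread_x * spread_y) ** 0.5


sat_scores = [1000, 1100, 1200, 1300, 1400]
gpas_rising = [2.4, 2.9, 3.0, 3.4, 3.8]    # GPAs increase as SAT scores increase
gpas_falling = [3.8, 3.4, 3.0, 2.9, 2.4]   # GPAs decrease as SAT scores increase

print(f"Positive correlation: r = {pearson_r(sat_scores, gpas_rising):.2f}")
print(f"Negative correlation: r = {pearson_r(sat_scores, gpas_falling):.2f}")
```

The coefficient ranges from -1 to +1: values near +1 indicate a strong
positive correlation, values near -1 a strong negative correlation, and
values near 0 little or no linear relationship.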

Prediction 

Another  broad  goal  of  research  is  prediction.  Prediction-based  research 
often stems from previously conducted descriptive research. If a
researcher finds that there is a relationship (i.e., correlation) between two
variables, then it may be possible to predict one variable from knowledge
of the other variable. For example, if a researcher found that there is a
relationship between SAT scores and GPAs, knowledge of the SAT scores
alone would allow the researcher to predict the associated GPAs.
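
One simple way to turn such a relationship into a prediction is to fit a
straight line to existing data and read predictions off that line. The
sketch below uses ordinary least squares, a technique we have chosen for
illustration (the text does not specify a method), on made-up SAT and GPA
values. The function names and data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical records from previously studied students.
sat_scores = [1000, 1100, 1200, 1300, 1400]
gpas = [2.4, 2.9, 3.0, 3.4, 3.8]

slope, intercept = fit_line(sat_scores, gpas)

# Predict the GPA of a new applicant from the SAT score alone.
new_sat = 1250
predicted_gpa = intercept + slope * new_sat
print(f"Predicted GPA for an SAT score of {new_sat}: {predicted_gpa:.2f}")
```

The prediction is only as good as the underlying relationship: the weaker
the correlation, the wider the margin of error around any single predicted
GPA.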

Many  important  questions  in  both  science  and  the  so-called  real  world 
involve  predicting  one  thing  based  on  knowledge  of  something  else.  For 
example, college admissions boards may attempt to predict success in
college based on the GPAs and SAT scores of the applicants. Employers may
attempt to predict job success based on work samples, test scores, and
candidate interviews. Psychologists may attempt to predict whether a
traumatic life event leads to depression. Medical doctors may attempt to
predict what levels of obesity and high blood pressure are associated with
cardiovascular disease and stroke. Meteorologists may attempt to predict
the amount of rain based on the temperature, barometric pressure,
humidity, and weather patterns. In each of these examples, a prediction is
being made based on existing knowledge of something else.
Understanding/Explanation

Being  able  to  describe  something  and  having  the  ability  to  predict  one 
thing  based  on  knowledge  of  another  are  important  goals  of  scientific 
research,  but  they  do  not  provide  researchers  with  a  true  understanding  of 
a phenomenon. One could argue that true understanding of a phenomenon
is achieved only when researchers successfully identify the cause or
causes of the phenomenon. For example, being able to predict a student’s
GPA in college based on his or her SAT scores is important and very
practical, but there is a limit to that knowledge. The most important
limitation is that a relationship between two things does not permit an
inference of causality. In other words, the fact that two things are related
and knowledge of one thing (e.g., SAT scores) leads to an accurate
prediction of the other thing (e.g., GPA) does not mean that one thing caused
the other. For example, a relationship between SAT scores and freshman GPAs
does not mean that the SAT scores caused the freshman-year GPAs. More than
likely, the SAT scores are indicative of other things that may be more
directly responsible for the GPAs. For example, the students who score
high on the SAT may also be the students who spend a lot of time
studying, and it is likely the amount of time studying that is the cause of a high
GPA. 
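
The study-time example can be mimicked with simulated data. In the sketch
below, study time influences both SAT scores and GPAs, while SAT scores have
no direct effect on GPAs at all. Everything here is an assumption made for
illustration: the coefficients, the noise levels, the seed, and the function
names are hypothetical, and removing a straight-line fit is just one simple
way of "holding study time constant."

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / (spread_x * spread_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


rng = random.Random(7)
study_hours = [rng.uniform(0, 30) for _ in range(5000)]
# Study time drives both outcomes; SAT never appears in the GPA formula.
sat = [900 + 20 * hours + rng.gauss(0, 100) for hours in study_hours]
gpa = [2.0 + 0.04 * hours + rng.gauss(0, 0.3) for hours in study_hours]

raw_r = pearson_r(sat, gpa)
adjusted_r = pearson_r(residuals(sat, study_hours), residuals(gpa, study_hours))

print(f"SAT-GPA correlation, ignoring study time:      {raw_r:.2f}")
print(f"SAT-GPA correlation, study time held constant: {adjusted_r:.2f}")
```

In this simulation the raw correlation between SAT scores and GPAs is
substantial, yet it shrinks to about zero once study time is taken into
account. A strong correlation was produced without any causal link between
the two correlated variables.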

The  ability  of  researchers  to  make  valid  causal  inferences  is  determined 
by the type of research designs they use. Correlational research, as
previously noted, does not permit researchers to make causal inferences
regarding the relationship between the two things that are correlated. By
contrast,
a  randomized  controlled  study,  which  will  be  discussed  in  detail  in  Chapter 
5,  permits  researchers  to  make  valid  cause-and-effect  inferences. 

There are three prerequisites for drawing an inference of causality
between two events (see Shaughnessy & Zechmeister, 1997). First, there
must be a relationship (i.e., a correlation) between the two events. In other
words, the events must covary: as one changes, the other must also
change. If two events do not covary, then a researcher cannot conclude
that one event caused the other event. For example, if there is no
relationship between television viewing and deterioration of eyesight, then
one cannot reasonably conclude that television viewing causes a deterioration
of eyesight.

Second,  one  event  (the  cause)  must  precede  the  other  event  (the  effect). 
This  is  sometimes  referred  to  as  a  time-order  relationship.  This  should  make 
intuitive  sense.  Obviously,  if  two  events  occur  simultaneously,  it  cannot  be 
concluded  that  one  event  caused  the  other.  Similarly,  if  the  observed  effect 
comes before the presumed cause, it would make little sense to conclude
that  the  cause  caused  the  effect. 

Third, alternative explanations for the observed relationship must be
ruled out. This is where it gets tricky. Stated another way, a causal
explanation between two events can be accepted only when other possible
causes of the observed relationship have been ruled out. An example may
help to clarify this last required condition for causality. Let’s say that a
researcher is attempting to study the effects of two different
psychotherapies on levels of depression. The researcher first obtains a
representative sample of people with the same level of depression (as
measured by a valid and reliable measure) and then randomly assigns them to
one of two groups. Group 1 will get Therapy A and Group 2 will get Therapy B.
The obvious goal is to compare levels of depression in both groups after
providing the therapy. It would be unwise in this situation for the
researcher to assign all of the participants under age 30 to Group 1 and all
of the participants over age 30 to Group 2: If, at the conclusion of the
study, Group 1 and Group 2 differed significantly in levels of depression,
the researcher would be unable to determine which variable (type of therapy
or age) was responsible for the reduced depression. We would say that this
research has been confounded, which means that two variables (in this case,
the type of therapy and age) were allowed to vary (or be different) at the
same time. Ideally, only the variable being studied (e.g., the type of
therapy) will differ between the two groups.

Prerequisites for Inferences of Causality

• There must be an existing relationship between two events.

• The cause must precede the effect.

• Alternative explanations for the relationship must be ruled out.
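
The difference between the confounded, age-based assignment discussed
above and random assignment can be sketched in a few lines. The participant
ages, the seed, and the function name below are hypothetical, and splitting
a shuffled list in half is only one simple way to randomize.

```python
import random
from statistics import mean


def randomly_assign(participants, seed=None):
    """Shuffle a copy of the participant list and split it into two groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Forty hypothetical participants, identified here only by age (18 to 57).
ages = list(range(18, 58))

# Confounded design: age and therapy group vary together.
younger_group = [age for age in ages if age < 30]
older_group = [age for age in ages if age >= 30]

# Random assignment: age is left to chance, so it tends to balance out.
group_1, group_2 = randomly_assign(ages, seed=42)

print(f"Age-based groups, mean ages: {mean(younger_group):.1f} vs {mean(older_group):.1f}")
print(f"Random groups, mean ages:    {mean(group_1):.1f} vs {mean(group_2):.1f}")
```

With age-based assignment the groups differ in mean age by 20 years, so any
difference in depression could be due to age rather than therapy. Random
assignment does not guarantee identical groups, but it makes a systematic
age gap of that kind very unlikely, leaving the type of therapy as the main
difference between the groups.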

OVERVIEW  OF  THE  BOOK 

The  focus  of  this  book  is,  obviously,  research  design  and  methodology. 
Although these terms are sometimes incorrectly used interchangeably,
they are distinct concepts with well-defined and circumscribed meanings.
Therefore, before proceeding any further, it would behoove us to define
these terms, at least temporarily. As defined by Kazdin (1992, 2003a), a
recognized leader in the field of research, methodology refers to the
principles, procedures, and practices that govern research, whereas research
design refers to the plan used to examine the question of interest.
“Methodology” should be thought of as encompassing the entire process of
conducting research (i.e., planning and conducting the research study,
drawing conclusions, and disseminating the findings). By contrast, “research
design” refers to the many ways in which research can be conducted to
answer the question being asked. These concepts will become
clearer  throughout  this  book,  but  it  is  important  that  you  understand  the 
focus  of  this  book  before  reading  any  further. 

Essentials  of  Research  Design  and  Methodology  succinctly  covers  all  of  the 
major  topic  areas  within  research  design  and  methodology.  Each  chapter 
in this book covers a specific research-related topic using
easy-to-understand language and illustrative examples. The book is not meant,
however, to replace the very extensive and comprehensive coverage of
research issues that can be found in other publications. For those readers
who would like a more in-depth understanding of the specific topic areas
covered in this book, we would suggest looking to the publications
included in the reference list at the end of this book. Finally, although each
chapter  builds  upon  the  knowledge  obtained  from  the  previous  chapters, 
each  chapter  can  also  be  used  as  a  stand-alone  summary  of  the  important 
points  within  that  topic  area.  For  this  reason,  we  occasionally  cover  some 
of  the  same  material  in  more  than  one  chapter. 

The chapters in Essentials of Research Design and Methodology are organized
in  a  manner  that  accurately  reflects  the  logical  flow  of  a  research  project 
from development to conclusion. The first three chapters lay the
foundation for conducting a research project. This chapter introduced you to
some of the key concepts relating to science, research design, and
methodology. As will be discussed, at a basic level, the first step in
conducting research involves coming up with an idea and translating that idea
into a testable question or statement. Chapter 2 discusses these preliminary
stages of research, including choosing a research idea, formulating a
research problem, choosing appropriate independent and dependent
variables, and selecting a sample of participants for your study. As every
researcher knows, coming up with a well-designed research study can be a
challenging process, but the importance of that task cannot be overstated.
Chapter 3 discusses some of the more common pitfalls faced by
researchers when thinking about the design of a research study.

After  a  research  question  has  been  formulated,  researchers  must 
choose a research design, collect and analyze the data, and draw some
conclusions. Chapter 4 will introduce you to the common measurement issues
and strategies that must be considered when designing a research study.
Chapter 5 will present a concise summary of the most common types of
research designs that are available to researchers; as will be discussed, the
type of research design chosen for a particular study depends largely on
the question being asked. Chapter 6 will focus on one of the most
important considerations in all of research: validity. Put simply, validity
refers to the soundness of the research design being used, with high
validity typically producing more accurate and meaningful results. Validity
comes in many forms, and Chapter 6 will discuss each one and how to maximize
it in the course of research. Chapter 7 will introduce you to many of the
issues faced by researchers when analyzing data and attempting to draw
conclusions  based  on  the  data. 

Most  research  is  subject  to  oversight  by  one  or  more  ethical  review 
committees,  such  as  a  university-based  institutional  review  board.  These 
committees  are  charged  with  the  important  task  of  reviewing  all  proposed 
research  studies  to  ensure  that  they  comply  with  applicable  regulations 
governing  research,  which  may  be  established  by  the  university,  the  city, 
the state, or the federal government, depending on the nature of the
research being conducted. Knowledge of the commonly encountered ethical
issues will assist researchers in avoiding ethical violations and resolving
ethical dilemmas. To this end, Chapter 8 will focus on the most commonly
encountered ethical issues faced by researchers when designing and
conducting a research study. Among other things, Chapter 8 will focus on the
important  topic  of  informed  consent  to  research. 

Finally,  Chapter  9  will  present  a  brief  section  on  the  dissemination  of 
research results, including publication in peer-reviewed journals and
presentations at professional conferences. Chapter 9 will include a
distillation of major principles of research design and methodology that are
applicable for those conducting research in a variety of capacities and
settings. Chapter 9 will conclude by presenting a checklist of the major
research-related concepts and considerations covered throughout this book.

Before concluding this chapter, one word of caution is necessary
regarding the focus of this book. As stated previously, research studies come
in many different forms, depending on the scientific discipline within
which the research is being conducted. For example, most research studies
in the field of quantum physics take place in a laboratory and do not
involve human participants. Contrast this with the research studies that are
conducted by social scientists, which may often take place in real-world
settings and involve human participants. For the sake of clarity,
consistency, and ease of reading, we thought that it was necessary to narrow
the focus of this book to one broad type of research. Therefore, throughout
this book, we will focus primarily on empirical research involving human
participants, which is most commonly found in the social and behavioral
sciences. Focusing on this type of research permits us to explore a wider
range of research-related considerations that must be addressed by
researchers across many scientific disciplines.
TEST YOURSELF

1. _______ can be defined as a methodological and systematic approach to
   the acquisition of new knowledge.

2. The defining characteristic of scientific research is the _______.

3. The _______ approach relies on direct observation and experimentation
   in the acquisition of new knowledge.

4. Scientists define key concepts and terms in the context of their research
   studies by using _______ definitions.

5. What are the three general goals of scientific research?

Answers: 1. Science; 2. scientific method; 3. empirical; 4. operational;
5. description, prediction, and understanding/explaining
Two


PLANNING  AND  DESIGNING  A 
RESEARCH  STUDY 


As  discussed  in  Chapter  1,  engaging  in  research  can  be  an  exciting 
and  rewarding  endeavor.  Through  research,  scientists  attempt  to 
answer  age-old  questions,  acquire  new  knowledge,  describe  how 
things  work,  and  ultimately  improve  the  way  we  all  live.  Despite  the  excit¬ 
ing  and  rewarding  nature  of  research,  deciding  to  conduct  a  research  study 
can  be  intimidating  for  both  inexperienced  and  experienced  researchers 
alike.  Novice  researchers  are  frequently  surprised — and  often  over¬ 
whelmed — by  the  sheer  number  of  decisions  that  need  to  be  made  in  the 
context  of  a  research  study.  Depending  on  the  scope  and  complexity  of  the 
research  study  being  considered,  there  are  typically  dozens  of  research- 
related  issues  that  need  to  be  addressed  in  the  planning  stage  alone.  As  a 
result,  the  early  stages  of  planning  a  research  study  can  often  seem  over¬ 
whelming  for  novice  researchers  with  little  experience  (and  even  for  sea¬ 
soned  researchers  with  considerable  experience,  although  they  may  not 
always  freely  admit  it). 

As  will  become  clear  throughout  this  chapter,  much  of  the  work  in¬ 
volved  in  conducting  a  research  study  actually  takes  place  prior  to  con¬ 
ducting  the  study  itself.  All  too  often,  novice  researchers  underestimate 
the  amount  of  preparatory  groundwork  that  needs  to  be  accomplished 
prior  to  collecting  any  data.  Although  the  preliminary  work  of  getting  a  re¬ 
search  study  started  differs  depending  on  the  type  of  research  being  con¬ 
ducted,  there  are  some  research-related  issues  that  are  common  to  most 
types of research. For example, prior to collecting any data at all,
researchers must typically identify a topic area of interest, conduct a
literature review, formulate a researchable question, articulate hypotheses,
determine who or what will be studied, identify the independent and
dependent variables that will be examined in the study, and choose an
appropriate research methodology. And these are just a few of the more common
research-related  issues  encountered  by  researchers.  Furthermore,  de¬ 
pending  on  the  context  in  which  the  research  is  taking  place,  there  may  be 
a  push  to  get  the  research  study  started  sooner  rather  than  later,  which  may 
further  contribute  to  the  researcher’s  feeling  overwhelmed  during  the 
planning  stage  of  a  research  study. 

In  addition  to  these  research-related  issues,  researchers  may  also  need 
to  consider  several  logistical  and  administrative  issues.  Administrative  and 
logistical  issues  include  things  such  as  who  is  paying  for  the  research, 
whether  research  staff  need  to  be  hired,  where  and  when  the  research 
study  will  be  conducted,  and  what  approvals  need  to  be  obtained  (and 
from  whom)  to  conduct  the  research  study.  And  this  is  just  a  small  sam¬ 
pling  of  the  preliminary  issues  that  researchers  need  to  address  during  the 
planning  stage  of  a  research  study. 

The  purpose  of  this  chapter  is  to  introduce  you  to  this  planning  stage. 
Because research studies differ greatly, both in terms of scope and
content, this chapter cannot possibly address all of the issues that need to be
considered  when  planning  and  designing  a  research  study.  Instead,  this 
chapter  will  focus  on  the  research-related  issues  that  are  most  commonly 
encountered  by  researchers  in  all  scientific  fields  (particularly  those  that 
involve  human  participants)  when  planning  and  designing  a  research 
study.  In  some  ways,  you  can  think  of  this  chapter  as  a  checklist  of  the  ma¬ 
jor  research-related  issues  that  need  to  be  considered  during  the  planning 
stage.  Although  some  of  the  topics  discussed  in  this  chapter  may  not  be 
applicable  in  the  context  of  your  particular  research,  it  is  important  for  you 
to be aware of these issues. After discussing how researchers typically
select the topics that they study, this chapter will discuss literature reviews,
the formulation of research problems, the development of testable
hypotheses, the identification and operationalization of independent and
dependent variables, and the selection and assignment of research
participants. Finally, this chapter will conclude with a discussion of the
impact of multicultural issues on research.


CHOOSING  A  RESEARCH  TOPIC 

The  first  step  in  designing  any  research  study  is  deciding  what  to  study. 
Researchers  choose  the  topics  that  they  study  in  a  variety  of  ways,  and  their 
decisions  are  necessarily  influenced  by  several  factors.  For  example, 
choosing  a  research  topic  will  obviously  be  largely  influenced  by  the  sci¬ 
entific  field  within  which  the  researcher  works.  As  you  know,  “science”  is 
a  broad  term  that  encompasses  numerous  specialized  and  diverse  areas  of 
study,  such  as  biology,  physics,  psychology,  anthropology,  medicine,  and 
economics,  just  to  name  a  few.  Researchers  achieve  competence  in  their 
particular  fields  of  study  through  a  combination  of  training  and  experi¬ 
ence,  and  it  typically  takes  many  years  to  develop  an  area  of  expertise. 

As  you  can  probably  imagine,  it  would  be  quite  difficult  for  a  researcher 
in  one  scientific  field  to  undertake  a  research  study  involving  a  topic  in  an 
entirely  different  scientific  field.  For  example,  it  is  highly  unlikely  that  a 
botanist  would  choose  to  study  quantum  physics  or  macroeconomics.  In 
addition  to  his  or  her  lacking  the  training  and  experience  necessary  for 
studying  quantum  physics  or  macroeconomics,  it  is  probably  reasonable 
to  conclude  that  the  botanist  does  not  have  an  interest  in  conducting 
research  studies  in  those  areas.  So,  assuming  that  researchers  have  the 
proper  training  and  experience  to  conduct  research  studies  in  their  re¬ 
spective  fields,  let’s  turn  our  attention  to  how  researchers  choose  the  top¬ 
ics  that  they  study  (see  Christensen,  2001;  Kazdin,  1992). 

Interest 

First  and  foremost,  researchers  typically  choose  research  topics  that  are  of 
interest  to  them.  Although  this  may  seem  like  common  sense,  it  is  impor¬ 
tant  to  occasionally  remind  ourselves  that  researchers  engage  in  research 
presumably  because  they  have  a  genuine  interest  in  the  topics  that  they 
study. A good question to ask at this point is how research interests
develop in the first place. There are several answers to this question.

Many researchers entered their chosen fields of study with long-standing
interests in those particular fields. For example, a psychologist
may have decided to become a researcher because of a long-standing
interest in how childhood psychopathology develops or how anxiety
disorders can be effectively treated with psychotropic medications. Other
researchers may have entered their chosen fields of study with specific
interests, and then perhaps refined those interests over the course of
their  careers.  Further,  as  many  researchers  will  attest,  it  is  certainly  not 
uncommon  for  researchers  to  develop  new  interests  throughout  their 
careers.  Through  the  process  of  conducting  research,  as  well  as  the  long 
hours  that  are  spent  reviewing  other  people’s  research,  researchers  can 
often  stumble  onto  new  and  often  unanticipated  research  ideas. 

Regardless  of  whether  researchers  enter  their  chosen  fields  with  spe¬ 
cific  interests  or  develop  new  interests  as  they  go  along,  many  researchers 
become  interested  in  particular  research  ideas  simply  by  observing  the 
world  around  them  (as  discussed  in  Chapter  1).  Merely  taking  an  interest 
in  a  specific  observed  phenomenon  is  the  impetus  for  a  great  amount  of 
research  in  all  fields  of  study.  In  summary,  a  researcher’s  basic  curiosity 
about  an  observed  phenomenon  typically  provides  sufficient  motivation 
for  choosing  a  research  topic. 

Problem  Solving 

Some  research  ideas  may  also  stem  from  a  researcher’s  motivation  to  solve 
a  particular  problem.  In  both  our  private  and  professional  lives,  we  have 
probably  all  come  across  some  situation  or  thing  that  has  caught  our  at¬ 
tention  as  being  in  need  of  change  or  improvement.  For  example,  a  great 
deal  of  research  is  currently  being  conducted  to  make  work  environments 
less  stressful,  diets  healthier,  and  automobiles  safer.  In  each  of  these  re¬ 
search  studies,  researchers  are  attempting  to  solve  some  specific  problem, 
such  as  work-related  stress,  obesity,  or  dangerous  automobiles.  This  type 
of problem-solving research is often conducted in corporate and
professional settings, primarily because the results of these types of research
studies  typically  have  the  added  benefit  of  possessing  practical  utility.  For 
example,  finding  ways  for  employers  to  reduce  the  work-related  stress  of 
employees  could  potentially  result  in  increased  levels  of  employee  pro¬ 
ductivity  and  satisfaction,  which  in  turn  could  result  in  increased  eco¬ 
nomic  growth  for  the  organization.  These  types  of  benefits  are  likely  to  be 
of  great  interest  to  most  corporations  and  businesses. 

Previous  Research 

Researchers  also  choose  research  topics  based  on  the  results  of  prior  re¬ 
search,  whether  conducted  by  them  or  by  someone  else.  Researchers  will 
likely  attest  that  previously  conducted  research  is  a  rich  and  plentiful 
source  of  research  ideas.  Through  exposure  to  the  results  of  research  stud¬ 
ies,  which  are  typically  published  in  peer-reviewed  journals  (see  Chapter  9 
for  a  discussion  of  publishing  the  results  of  research  studies),  a  researcher 
may  develop  a  research  interest  in  a  particular  area.  For  example,  a  sociol¬ 
ogist  who  primarily  studies  the  socialization  of  adolescents  may  take  an  in¬ 
terest  in  studying  the  related  phenomenon  of  adolescent  gang  behavior 
after  being  exposed  to  research  studies  on  that  topic.  In  these  instances, 
researchers  may  attempt  to  replicate  the  results  obtained  by  the  other  re¬ 
searchers  or  perhaps  extend  the  findings  of  the  previous  research  to  dif¬ 
ferent  populations  or  settings.  As  noted  by  Kazdin  (1992),  a  large  portion 
of  research  stems  from  researchers’  efforts  to  build  upon,  expand,  or  re¬ 
explain  the  results  of  previously  conducted  research  studies.  In  fact,  it  is 
often  quipped  that  “research  begets  research,”  primarily  because  research 
tends  to  raise  more  questions  than  it  answers,  and  those  newly  raised  ques¬ 
tions  often  become  the  focus  of  future  research  studies. 

Theory 

Finally,  theories  (see  Rapid  Reference  2.1  for  a  definition)  often  serve  as  a 
good  source  for  research  ideas.  Theories  can  serve  several  purposes,  but 
in the research context, they typically function as a rich source of
hypotheses that can be examined empirically. This brings us to an
important point that should not be glossed over: specifically, that
research ideas (and the hypotheses and research designs that follow
from those ideas) should be based on some theory (Serlin, 1987). For
example, a researcher may have a theory regarding the development of
depression among elderly males. In this example, the researcher may
theorize that elderly males become depressed due to their
reduced  ability  to  engage  in  enjoyable  physical  activities.  This  hypothetical 
theory,  like  most  other  theories,  makes  a  prediction.  In  this  instance,  the 
theory  makes  a  specific  prediction  about  what  causes  depression  among 
elderly  males.  The  predictions  suggested  by  theories  can  often  be  trans¬ 
formed  into  testable  hypotheses  that  can  then  be  examined  empirically  in 
the  context  of  a  research  study. 

In  the  preceding  paragraphs,  we  have  only  briefly  touched  upon  several 
possible  sources  for  research  ideas.  There  are  obviously  many  more 
sources  we  could  have  discussed,  but  space  limitations  preclude  us  from 
entering  into  a  full  discourse  on  this  topic.  The  important  point  to  re¬ 
member  from  this  discussion  is  that  research  ideas  can — and  do — come 
from  a  variety  of  different  sources,  many  of  which  we  commonly  en¬ 
counter  in  our  daily  lives. 

Throughout  this  discussion,  you  may  have  noticed  that  we  have  not 
commented  on  the  quality  of  the  research  idea.  Instead,  we  have  limited 
our  discussion  thus  far  to  how  researchers  choose  research  ideas,  and  not 
to  whether  those  ideas  are  good  ideas.  There  are  many  situations,  however, 
in  which  the  quality  of  the  research  idea  is  of  paramount  importance.  For 
example,  when  submitting  a  research  proposal  as  part  of  a  grant  applica¬ 
tion,  the  quality  of  the  research  idea  is  an  important  consideration  in  the 
funding decision. Although judging whether a research idea is good may
appear to be somewhat subjective, there are some generally accepted
criteria that can help in this determination. Is the research idea creative? Will
the results of the research study make a valuable and significant
contribution to the literature or practice in a particular field? Does the research
study address a question that is considered important in the field?
Questions like these can often be answered by looking through the existing
literature to see how the particular research study fits into the bigger picture.
So, let’s turn our attention to the logical next step in the planning phase of
a research study: the literature review.

Rapid Reference 2.1

Theory

A theory is a conceptualization, or description, of a phenomenon that
attempts to integrate all that we know about the phenomenon into
a concise statement or question.

LITERATURE  REVIEW 

Once  a  researcher  has  chosen  a  specific  topic,  the  next  step  in  the  planning 
phase  of  a  research  study  is  reviewing  the  existing  literature  in  that  topic 
area.  If  you  are  not  yet  familiar  with  the  process  of  conducting  a  literature 
review,  it  simply  means  becoming  familiar  with  the  existing  literature  (e.g., 
books,  journal  articles)  on  a  particular  topic.  Obviously,  the  amount  of 
available  literature  can  differ  significantly  depending  on  the  topic  area  be¬ 
ing  studied,  and  it  can  certainly  be  a  time-consuming,  arduous,  and  diffi¬ 
cult  process  if  there  has  been  a  great  deal  of  research  conducted  in  a  par¬ 
ticular  area.  Ask  any  researcher  (or  research  assistant)  about  conducting 
literature  reviews  and  you  will  likely  encounter  similar  comments  about 
the  length  of  time  that  is  spent  looking  for  literature  on  a  particular  topic. 

Fortunately,  the  development  of  comprehensive  electronic  databases 
has  facilitated  the  process  of  conducting  literature  reviews.  In  the  past  few 
years,  individual  electronic  databases  have  been  developed  for  several  spe¬ 
cific  fields  of  study.  For  example,  medical  researchers  can  access  existing 
medical  literature  through  Medline;  social  scientists  can  use  PsychINFO 
(see Rapid Reference 2.2) or PsychLIT; and legal researchers can use
Westlaw or Lexis. Access to most of these electronic database services is
restricted to individuals with subscriptions or to those who are affiliated
with  university-based  library  systems.  Although  gaining  access  to  these 
services  can  be  expensive,  the  advent  of  these  electronic  databases  has 
made  the  process  of  conducting  thorough  literature  reviews  much  easier 
and more efficient. No longer are researchers (or their student
assistants!) forced to look through shelf after shelf of dusty scientific
journals.

The importance and value of a well-conducted and thorough literature
review cannot be overstated in the context of planning a research study
(see Christensen, 2001). The primary purpose of a literature review is to
help researchers become familiar with the work that has already been
conducted in their selected topic areas. For example, if a researcher
decides to investigate the onset of diabetes among the elderly, it would
be important for him or her to have an understanding of the current
state of the knowledge in that area.

Literature  reviews  are  absolutely  indispensable  when  planning  a  re¬ 
search  study  because  they  can  help  guide  the  researcher  in  an  appropriate 
direction  by  answering  several  questions  related  to  the  topic  area.  Have 
other  researchers  done  any  work  in  this  topic  area?  What  do  the  results  of 
their  studies  suggest?  Did  previous  researchers  encounter  any  unforeseen 
methodological  difficulties  of  which  future  researchers  should  be  aware 
when  planning  or  conducting  studies?  Does  more  research  need  to  be 
conducted  on  this  topic,  and  if  so,  in  what  specific  areas?  A  thorough  lit¬ 
erature  review  should  answer  these  and  related  questions,  thereby  helping 
to  set  the  stage  for  the  research  being  planned. 

Often, the results of a well-conducted literature review will reveal that
the study being planned has, in fact, already been conducted. This would
obviously be important to know during the planning phase of a study, and
it would certainly be beneficial to be aware of this fact sooner rather than
later. Other times, researchers may change the focus or methodology of
their studies based on the types of studies that have already been
conducted. Literature reviews can often be intimidating for novice
researchers, but like most other things relating to research, they become
easier as you gain experience.

Rapid Reference 2.2

PsychINFO

PsychINFO is an electronic bibliographic database that provides
abstracts and citations to the scholarly literature in the behavioral
sciences and mental health. PsychINFO includes references to journal
articles, books, dissertations, and university and government reports.
The database contains more than 1.9 million references dating
from 1840 to the present, and is updated weekly.

DON'T FORGET

Literature Reviews

Scouring the existing literature to get ideas for future research is a
technique used by most researchers. It is important to note, however, that
being familiar with the literature in a particular topic area also serves
another purpose. Specifically, it is crucial for researchers to know what types
of studies have been conducted in particular areas so they can determine
whether their specific research questions have already been answered. To
be clear, it is certainly a legitimate goal of research to replicate the results
of other studies, but there is a difference between replicating a study for
purposes of establishing the robustness or generalizability of the original
findings and simply duplicating a study without having any knowledge that
the same study has already been conducted. You can often save yourself a
good deal of time and money by simply looking to the literature to see
whether the study you are planning has already been conducted.


FORMULATING  A  RESEARCH  PROBLEM 

After  selecting  a  specific  research  topic  and  conducting  a  thorough  litera¬ 
ture  review,  you  are  ready  to  take  the  next  step  in  planning  a  research  study: 
clearly  articulating  the  research  problem.  The  research  problem  (see  Rapid 
Reference  2.3)  typically  takes  the  form  of  a  concise  question  regarding  the 
relationship  between  two  or  more  variables.  Examples  of  research  prob¬ 
lems  include  the  following:  (1)  Is  the  onset  of  depression  among  elderly 
males  related  to  the  development  of  physical  limitations?  (2)  What  effect 
does a sudden dip in the Dow Jones Industrial Average have on the
economy of small businesses? (3) Will a high-fiber, low-fat diet be effective in
reducing  cholesterol  levels  among  middle-aged  females?  (4)  Can  a  mem¬ 
ory  enhancement  class  improve  the  memory  functioning  of  patients  with 
progressive  dementia? 
When articulating a research question, it is critically important
to make sure that the question is specific enough to avoid confusion
and to indicate clearly what is being studied. In other words, the
research problem should be composed of a precisely stated research
question that clearly identifies the variables being studied. A
vague research question often results in methodological confusion,
because the research question does not clearly indicate what
or who is being studied. The following are some examples of
vague and nonspecific research questions: (1) What effect does weather
have on memory? (2) Does exercise improve physical and mental health?
(3) Does taking street drugs result in criminal behavior? As you can see,
each of these questions is rather vague, and it is impossible to determine
exactly what is being studied. For example, in the first question, what type
of weather is being studied, and memory for what? In the second question, is
the researcher studying all types of exercise, and the effects of exercise on
the physical and mental health of all people or a specific subgroup of
people? Finally, in the third question, which street drugs are being studied,
and what specific types of criminal behavior?

Rapid Reference 2.3

Criteria for Research Problems

Good research problems must meet three criteria (see Kerlinger, 1973).
First, the research problem should describe the relationship
between two or more variables. Second, the research problem
should take the form of a question. Third, the research problem
must be capable of being tested empirically (i.e., with data derived
from direct observation and experimentation).

An effective way to avoid confusion in formulating research questions
is by using operational definitions. Through the use of operational
definitions, researchers can specifically and clearly identify what (or who) is
being studied (see Kazdin, 1992). As briefly discussed in Chapter 1,
researchers use operational definitions to define key concepts and terms in
the specific contexts of their research studies. The benefit of using
operational definitions is that they help to ensure that everyone is talking about
the same phenomenon. Among other things, this will greatly assist future
researchers who attempt to replicate a given study’s results. Obviously, if
researchers cannot determine what or who is being studied, they will
certainly not be able to replicate the study. Let’s look at an example of how
operational definitions can be effectively used when formulating a
research question.

Let’s  say  that  a  researcher  is  interested  in  studying  the  effects  of  large 
class  sizes  on  the  academic  performance  of  gifted  children  in  high- 
population  schools.  The  research  question  may  be  phrased  in  the  follow¬ 
ing  manner:  “What  effects  do  large  class  sizes  have  on  the  academic  per¬ 
formance  of  gifted  children  in  high-population  schools?”  This  may  seem 
to  be  a  fairly  straightforward  research  question,  but  upon  closer  examina¬ 
tion,  it  should  become  evident  that  there  are  several  important  terms  and 
concepts  that  need  to  be  defined.  For  example,  what  constitutes  a  “large 
class”;  what  does  “academic  performance”  refer  to;  which  kids  are  con¬ 
sidered  “gifted”;  and  what  is  meant  by  “high-population  schools”? 

To reduce confusion, the terms and concepts included in the research
question need to be clarified through the use of operational definitions.
For example, “large classes” may be defined as classes with 30 or more
students; “academic performance” may be limited to scores received on
standardized achievement tests; “gifted” children may include only those
children who are in advanced classes; and “high-population schools” may be
defined as schools with more than 1,000 students. Without operationally
defining these key terms and concepts, it would be difficult to determine
what exactly is being studied. Further, the specificity of the operational
definitions will allow future researchers to replicate the research study.

DON'T FORGET

Operational Definitions

An important point to keep in mind is that an operational definition is
specific to the particular study in which it is used. Although researchers
can certainly use the same operational definitions in different studies
(which facilitates replication of the study results), different studies can
operationally define the same terms and concepts in different ways. For
example, in one study, a researcher may define “gifted children” as those
children who are in advanced classes. In another study, however, “gifted
children” may be defined as children with IQs of 130 or higher. There is
no one correct definition of “gifted children,” but providing an operational
definition reduces confusion by specifying what is being studied.
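
One way to see how specific an operational definition has to be is to try writing it down as an explicit rule. The following is only an illustrative sketch in Python, not a prescribed procedure: the thresholds come from the class-size example above, while the record layout and every function and field name are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass

# Thresholds taken from the example operational definitions in the text.
LARGE_CLASS_MIN_STUDENTS = 30       # "large class": 30 or more students
HIGH_POPULATION_THRESHOLD = 1000    # "high-population school": more than 1,000 students


@dataclass
class StudentRecord:
    """Hypothetical record layout; the field names are assumptions."""
    class_size: int
    school_enrollment: int
    in_advanced_classes: bool
    achievement_test_score: float   # "academic performance": standardized test score


def is_large_class(record: StudentRecord) -> bool:
    # 30 or more students counts as a large class.
    return record.class_size >= LARGE_CLASS_MIN_STUDENTS


def is_high_population_school(record: StudentRecord) -> bool:
    # Strictly more than 1,000 students.
    return record.school_enrollment > HIGH_POPULATION_THRESHOLD


def is_gifted(record: StudentRecord) -> bool:
    # In this example study, "gifted" means enrolled in advanced classes.
    return record.in_advanced_classes


def in_study_population(record: StudentRecord) -> bool:
    # The research question concerns gifted children in high-population schools.
    return is_gifted(record) and is_high_population_school(record)


# Usage: one made-up student record.
student = StudentRecord(class_size=32, school_enrollment=1200,
                        in_advanced_classes=True, achievement_test_score=88.0)
print(in_study_population(student), is_large_class(student))
```

Writing the definitions this way forces each boundary case to be decided (for example, whether a class of exactly 30 students is "large"), which is the same precision a future researcher would need in order to replicate the study.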

ARTICULATING  HYPOTHESES 

The  next  step  in  planning  a  research  study  is  articulating  the  hypotheses 
that  will  be  tested.  This  is  yet  another  step  in  the  planning  phase  of  a 
research  study  that  can  be  somewhat  intimidating  for  inexperienced  re¬ 
searchers.  Articulating  hypotheses  is  truly  one  of  the  most  important  steps 
in  the  research  planning  process,  because  poorly  articulated  hypotheses 
can  ruin  what  may  have  been  an  otherwise  good  study.  The  following  dis¬ 
cussion  regarding  hypotheses  can  get  rather  complicated,  so  we  will  at¬ 
tempt  to  keep  the  discussion  relatively  short  and  to  the  point. 

As  briefly  discussed  in  Chapter  1,  hypotheses  attempt  to  explain,  predict, 
and  explore  the  phenomenon  of  interest.  In  many  types  of  studies,  this 
means  that  hypotheses  attempt  to  explain,  predict,  and  explore  the  rela¬ 
tionship  between  two  or  more  variables  (Kazdin,  1992;  see  Christensen, 
2001).  To  this  end,  hypotheses  can  be  thought  of  as  the  researcher’s  edu¬ 
cated  guess  about  how  the  study  will  turn  out.  As  such,  the  hypotheses 
articulated  in  a  particular  study  should  logically  stem  from  the  research 
problem  being  investigated. 

Before  we  discuss  specific  types  of  hypotheses,  there  are  two  important 
points that you should keep in mind. First, all hypotheses must be
falsifiable. That is, hypotheses must be capable of being refuted based on the
results of the study (Christensen, 2001). This point cannot be emphasized
enough.  Put  simply,  if  a  researcher’s  hypothesis  cannot  be  refuted,  then  the 
researcher  is  not  conducting  a  scientific  investigation.  Articulating  hy¬ 
potheses  that  are  not  falsifiable  is  one  sure  way  to  ruin  what  could  have 
otherwise  been  a  well-conducted  and  important  research  study.  Second,  as 
briefly  discussed  in  Chapter  1,  a  hypothesis  must  make  a  prediction  (usually 
about  the  relationship  between  two  or  more  variables).  The  predictions 
embodied in hypotheses are subsequently tested empirically by gathering
and analyzing data, and the hypotheses can then be either supported or
refuted.

Now  that  you  have  been  introduced  to  the  topic  of  hypotheses,  we 
should  turn  our  attention  to  specific  types  of  hypotheses.  There  are  two 
broad  categories  of  hypotheses  with  which  you  should  be  familiar. 


Null  Hypotheses  and  Alternate  Hypotheses 

The first category of research hypotheses, which was briefly discussed in
Chapter 1, includes the null hypothesis and the alternate (or experimental)
hypothesis. In research studies involving two groups of participants (e.g.,
experimental group vs. control group), the null hypothesis always predicts
that there will be no differences between the groups being studied
(Kazdin, 1992). If, however, a particular research study does not involve
groups of study participants, but instead involves only an examination of
selected variables, the null hypothesis predicts that there will be no
relationship between the variables being studied. By contrast, the
alternate hypothesis always predicts that there will be a difference
between the groups being studied (or a relationship between the variables
being studied).

Let’s look at an example to clarify the distinction between null
hypotheses and alternate hypotheses. In a research study investigating the
effects of a newly developed medication on blood pressure levels, the null
hypothesis would predict that there will be no difference in terms of blood
pressure levels between the group that receives the medication (i.e., the
experimental group) and the group that does not receive the medication
(i.e., the control group). By contrast, the alternate hypothesis would
predict that there will be a difference between the two groups with respect
to blood pressure levels. So, for example, the alternate hypothesis may
predict that the group that receives the new medication will experience a
greater reduction in blood pressure levels than the group that does not
receive the new medication.

It is not uncommon for research studies to include several null and
alternate hypotheses. The number of null and alternate hypotheses included
in a particular research study depends on the scope and complexity of the
study and the specific questions being asked by the researcher. It is
important to keep in mind that the number of hypotheses being tested has
implications for the number of research participants that will be needed to
conduct the study. This last point rests on rather complex statistical
concepts that we will not discuss in this section. For our purposes, it is
sufficient to remember that as the number of hypotheses increases, the
number of required participants also typically increases.
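For readers who want a rough feel for why more hypotheses usually call for more participants, the following sketch may help. It is only an illustration under assumptions that do not come from this text: it assumes a Bonferroni correction (the overall significance level is divided evenly among the hypotheses), a two-group comparison of means, a medium standardized effect size of 0.5, and the usual normal-approximation formula for sample size. The function name and default values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(num_hypotheses, effect_size=0.5, alpha=0.05, power=0.80):
    """Approximate participants needed per group for one two-sided comparison.

    Assumptions (illustrative, not from the text):
    - alpha is split evenly across hypotheses (Bonferroni correction)
    - normal approximation: n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    """
    per_test_alpha = alpha / num_hypotheses
    z_alpha = NormalDist().inv_cdf(1 - per_test_alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# As the number of hypotheses grows, each test gets a stricter threshold,
# so more participants are needed to keep the same statistical power.
for m in (1, 3, 5):
    print(m, "hypotheses ->", participants_per_group(m), "participants per group")
```

Other correction methods and other study designs would give different numbers; the point of the sketch is only the direction of the trend.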

In scientific research, keep in mind that it is the null hypothesis that is
tested, and the null hypothesis is then either rejected or not rejected.
Remember, if the null hypothesis is rejected (a decision that is based on
the results of statistical analyses, which will be discussed in later
chapters), the researcher can reasonably conclude that there is a
difference between the groups being studied (or a relationship between the
variables being studied). Rejecting the null hypothesis allows a researcher
to not reject the alternate hypothesis, and not rejecting a hypothesis is
the most we can do in scientific research. To be clear, we can never accept
a hypothesis; we can only fail to reject a hypothesis (as was briefly
discussed in Chapter 1). Accordingly, researchers typically seek to reject
the null hypothesis, which empirically demonstrates that the groups being
studied differ on the variables being examined in the study. This last
point may seem counterintuitive, but it is an extremely important concept
that you should keep in mind.
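The logic of testing the null hypothesis can be made concrete with a small sketch. The data below are hypothetical (invented reductions in blood pressure for the medication example above), and the exact permutation test is simply one convenient way to test a null hypothesis of "no difference between groups"; it is not necessarily the analysis a given study would use. The 0.05 cutoff is a common convention assumed here for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Under the null hypothesis the group labels are interchangeable, so we
    count how many relabelings produce a difference at least as large as
    the one actually observed.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)
    relabelings = extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        relabelings += 1
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
    return extreme / relabelings


# Hypothetical reductions in blood pressure (mmHg) for each participant.
medication = [12, 15, 9, 14, 11, 13]
control = [3, 5, 2, 6, 4, 1]

p_value = permutation_p_value(medication, control)
decision = "reject" if p_value < 0.05 else "fail to reject"
print(f"p = {p_value:.4f}: {decision} the null hypothesis")
```

Note that the two possible outcomes are "reject" and "fail to reject"; nothing in the procedure ever accepts a hypothesis.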

Directional  Hypotheses  and  Nondirectional  Hypotheses 

The second category of research hypotheses includes directional
hypotheses and nondirectional hypotheses. In research studies involving
groups of study participants, the decision regarding whether to use a
directional or a nondirectional hypothesis is based on whether the
researcher has some idea about how the groups being studied will differ.
Specifically, researchers use nondirectional hypotheses when they believe
that the groups will differ, but they do not have a belief regarding how
the groups will differ (i.e., in which direction they will differ). By
contrast, researchers use directional hypotheses when they believe that the
groups being studied will differ, and they have a belief regarding how the
groups will differ (i.e., in a particular direction).

A simple example should help clarify the important distinction between
directional and nondirectional hypotheses. Let’s say that a researcher is
using a standard two-group design (i.e., one experimental group and one
control group) to investigate the effects of a memory enhancement class
on college students’ memories. At the beginning of the study, all of the
study participants are randomly assigned to one of the two groups. (We
will talk about the important concept of random assignment later in this
chapter and in Chapter 3. The concept of informed consent, which we mention
briefly in Rapid Reference 2.4, is covered in Chapter 8.) Subsequently, one
group (i.e., the experimental group) will be exposed to the memory
enhancement class and the other group (i.e., the control group) will not be
exposed to the memory enhancement class. Afterward, all of the participants
in both groups will be administered a memory test. Based on this research
design, any observed differences between the two groups on the memory test
can reasonably be attributed to the effects of the memory enhancement
class.
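Random assignment in a two-group design like this one can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the participant ID numbers, the function name, and the fixed seed (used only so the example is reproducible) are all assumptions made here.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Hypothetical study with 20 participants, identified by number.
groups = randomly_assign(range(1, 21), seed=42)
print("Experimental group:", sorted(groups["experimental"]))
print("Control group:     ", sorted(groups["control"]))
```

Because chance alone decides who ends up in which group, the two groups should not differ systematically before the class is delivered.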


Rapid Reference 2.4

Informed Consent

Prior to your collecting any data from study participants, the participants
must voluntarily agree to participate in the study. Through a process
called informed consent, all potential study participants are informed
about the procedures that will be used in the study, the risks and benefits
of participating in the study, and their rights as study participants.
There are, however, a few limited instances in which researchers are not
required to obtain informed consent from the study participants, and it is
therefore important that researchers become knowledgeable about when
informed consent is required. The topic of informed consent will be
discussed in detail in Chapter 8.

Rapid Reference 2.5

Nondirectional Hypotheses vs. Directional Hypotheses

A reliable way to tell the difference between directional and
nondirectional hypotheses is to look at the wording of the hypotheses. If
the hypothesis simply predicts that there will be a difference between the
two groups, then it is a nondirectional hypothesis. It is nondirectional
because it predicts that there will be a difference but does not specify
how the groups will differ. If, however, the hypothesis uses so-called
comparison terms, such as “greater,” “less,” “better,” or “worse,” then it
is a directional hypothesis. It is directional because it predicts that
there will be a difference between the two groups and it specifies how the
two groups will differ.


In this example, the researcher has several options in terms of
hypotheses. On the one hand, the researcher may simply hypothesize that
there will be a difference between the two groups on the memory test.
This would be an example of a nondirectional hypothesis, because the
researcher is hypothesizing that the two groups will differ, but the
researcher is not specifying how the two groups will differ. Alternatively,
the researcher could hypothesize that the participants who are exposed to
the memory enhancement class will perform better on the memory test than
the participants who are not exposed to the memory enhancement class.
This would be an example of a directional hypothesis, because the
researcher is hypothesizing that the two groups will differ and specifying
how the two groups will differ (i.e., one group will perform better than
the other group on the memory test). See Rapid Reference 2.5 for a tip on
how to distinguish between directional and nondirectional hypotheses.
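In statistical terms, the choice between a nondirectional and a directional hypothesis typically corresponds to the choice between a two-sided and a one-sided test. The sketch below illustrates this with hypothetical memory test scores and an exact permutation test; both the data and the choice of test are assumptions made for illustration. The directional version assumes the prediction is that the class group scores higher.

```python
from itertools import combinations


def exact_p_value(treated, control, directional):
    """Exact permutation p-value for a difference in group means.

    directional=False: "the groups differ" (two-sided).
    directional=True:  "the treated group scores higher" (one-sided).
    """
    pooled = treated + control
    n_t, n_c = len(treated), len(control)
    total = sum(pooled)

    def mean_diff(sum_t):
        return sum_t / n_t - (total - sum_t) / n_c

    observed = mean_diff(sum(treated))
    relabelings = extreme = 0
    for idx in combinations(range(len(pooled)), n_t):
        diff = mean_diff(sum(pooled[i] for i in idx))
        relabelings += 1
        if directional:
            extreme += diff >= observed - 1e-9
        else:
            extreme += abs(diff) >= abs(observed) - 1e-9
    return extreme / relabelings


# Hypothetical memory test scores.
class_group = [18, 21, 17, 22, 20]
no_class_group = [16, 19, 15, 18, 17]

p_nondirectional = exact_p_value(class_group, no_class_group, directional=False)
p_directional = exact_p_value(class_group, no_class_group, directional=True)
print(f"nondirectional p = {p_nondirectional:.3f}")
print(f"directional p    = {p_directional:.3f}")
```

With equal group sizes and a result in the predicted direction, the directional p-value here is half the nondirectional one, which is why the direction must be chosen before the data are seen.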

CHOOSING  VARIABLES  TO  STUDY 

We are now very close to beginning the actual study, but there are still a
few things remaining to do before we begin collecting data. Before
proceeding any further, it would probably be helpful for us to take a
moment and see where we are in this process of planning a research study.
So far, we have discussed how researchers (1) come up with researchable
ideas; (2) conduct thorough literature reviews to see what has been done in
their topic areas (and, if necessary, to refine the focus of their studies
based on the results of the prior research); (3) formulate concise research
problems with clearly defined concepts and terms (using operational
definitions); and (4) articulate falsifiable hypotheses. We have certainly
accomplished quite a bit, but there is still a little more to do before
beginning the study itself.

The next step in planning a research study is identifying what variables
(see Rapid Reference 2.6) will be the focus of the study. There are many
categories of variables that can appear in research studies. However, rather
than discussing every conceivable one, we will focus our attention on the
most commonly used categories. Although not every research study will
include all of these variables, it is important that you are aware of the
differences among the categories and when each type of variable may be
used.


Rapid Reference 2.6


Variables 

A  variable  is  anything  that  can  take 
on  different  values.  For  example, 
height,  weight,  age,  race,  attitude, 
and  IQ  are  variables  because  there 
are  different  heights,  weights,  ages, 
races,  attitudes,  and  IQs.  By  con¬ 
trast,  if  something  cannot  vary,  or 
take  on  different  values,  then  it  is 
referred  to  as  a  constant. 


Independent  Variables  vs.  Dependent  Variables 

When discussing variables, perhaps the most important distinction is
between independent and dependent variables. The independent variable is
the factor that is manipulated or controlled by the researcher. In most
studies, researchers are interested in examining the effects of the
independent variable. In its simplest form, the independent variable has
two levels: present or absent. For example, in a research study
investigating the effects of a new type of psychotherapy on symptoms of
anxiety, one group will be exposed to the psychotherapy and one group will
not be exposed to the psychotherapy. In this example, the independent
variable is the psychotherapy, because the researcher can control whether
the study participants are exposed to it and the researcher is interested
in examining the effects of the psychotherapy on symptoms of anxiety. As
you may already know, the group in which the independent variable is
present (i.e., that is exposed to the psychotherapy) is referred to as the
experimental group, whereas the group in which the independent variable is
not present (i.e., that is not exposed to the psychotherapy) is referred to
as the control group.

Although, in its simplest form, an independent variable has only two
levels (i.e., present or absent), it is certainly not uncommon for an
independent variable to have more than two levels. For example, in a
research study examining the effects of a new medication on symptoms of
depression, the researcher may include three groups in the study: one
control group and two experimental groups. As usual, the control group
would not get the medication (or would get a placebo), while one
experimental group may get a lower dose of the medication and the other
experimental group may get a higher dose of the medication. In this
example, the independent variable (i.e., medication) consists of three
levels: absent, low, and high. Other levels of independent variables are,
of course, also possible, such as low, medium, and high; or absent, low,
medium, and high. Researchers make decisions regarding the number of levels
of an independent variable based on a careful consideration of several
factors, including the number of available study participants, the degree
of specificity of results they desire to achieve with the study, and the
associated financial costs.

It is also common for a research study to include multiple independent
variables, perhaps with each of the independent variables consisting of
multiple levels. For example, a researcher may attempt to investigate the
effects of both medication and psychotherapy on symptoms of depression. In
this example, there are two independent variables (i.e., medication and
psychotherapy), and each independent variable could potentially consist of
multiple levels (e.g., low, medium, and high doses of medication;
cognitive behavioral therapy, psychodynamic therapy, and rational emotive
therapy). As you can see, things have a tendency to get complicated
fairly quickly when researchers use multiple independent variables with
multiple levels.
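To see how quickly the complexity grows, the following short sketch lists every combination of levels for the two independent variables in this example. It assumes, for illustration only, that the researcher crosses every medication level with every type of therapy; a real study might use only some of these combinations.

```python
from itertools import product

# Levels of the two independent variables from the example above.
medication_levels = ["low dose", "medium dose", "high dose"]
therapy_types = ["cognitive behavioral", "psychodynamic", "rational emotive"]

# Assumption: a fully crossed design, in which every medication level
# is paired with every therapy type.
conditions = list(product(medication_levels, therapy_types))

for medication, therapy in conditions:
    print(f"{medication} + {therapy} therapy")
print("Number of experimental conditions:", len(conditions))
```

Two variables with three levels each already produce nine separate conditions, each of which needs its own group of participants.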

At this point in the discussion, you should be actively resisting the urge
to be intimidated by the material presented so far in this chapter. We have
covered quite a bit of information, and it is getting more complicated as
we go. Keeping track of the different categories and types of variables can
certainly be difficult, even for those of us with considerable research
experience. If you are getting confused, it may be helpful to reduce things
to their simplest terms. In the case of independent variables, the
important point to keep in mind is that researchers are interested in
examining the effects of an independent variable on something, and that
something is the dependent variable (Isaac & Michael, 1997). Let’s now turn
our attention to dependent variables.

The dependent variable is a measure of the effect (if any) of the
independent variable. For example, a researcher may be interested in
examining the effects of a new medication on symptoms of depression among
college students. In this example, prior to administering any medication,
the researcher would most likely administer a valid and reliable measure of
depression, such as the Beck Depression Inventory (Beck, Ward, Mendelson,
Mock, & Erbaugh, 1961), to a group of study participants. The Beck
Depression Inventory is a well-accepted self-report inventory of symptoms
of depression. Administering a measure of depression to the study
participants prior to administering any medication allows the researcher to
obtain what is called a baseline measure of depression, which simply means
a measurement of the levels of depression that are present prior to the
administration of any intervention (e.g., psychotherapy, medication). The
researcher then randomly assigns the study participants to two groups, an
experimental group that receives the new medication and a control group
that does not receive the new medication (perhaps its members are
administered a placebo).

After administering the medication (or not administering the medication,
for the control group), the researcher would then readminister the
Beck Depression Inventory to all of the participants in both groups. The
researcher now has two Beck Depression Inventory scores for each of the
participants in both groups: one score from before the medication was
administered and one score from after the medication was administered.
(By the way, this type of research design is referred to as a pre/post
design, because the dependent variable is measured both before and after
the intervention is administered. We will talk about this type of research
design in Chapter 5.) These two depression scores can then be compared to
determine whether the medication had any effect on the levels of
depression. Specifically, if the scores on the Beck Depression Inventory
decrease (which indicates lower levels of depression) for the participants
in the experimental group, but not for the participants in the control
group, then the researcher can reasonably conclude that the medication was
effective in reducing symptoms of depression. To be more precise, for the
researcher to conclude that the medication was effective in reducing
symptoms of depression, there would need to be a statistically significant
difference in Beck Depression Inventory scores between the experimental
group and the control group, but we will put that point aside for the
moment.
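The comparison at the heart of a pre/post design can be sketched as follows. The scores are hypothetical and were invented for this illustration; they are not data from any study. The sketch computes only the average change in each group and leaves aside the question of statistical significance, just as the discussion above does.

```python
from statistics import mean

# Hypothetical (baseline, follow-up) Beck Depression Inventory scores,
# one pair per participant. Lower scores indicate less depression.
scores = {
    "experimental": [(28, 15), (31, 18), (25, 14), (30, 20)],
    "control": [(27, 26), (32, 30), (26, 27), (29, 28)],
}


def mean_change(pairs):
    """Average follow-up score minus baseline score; negative means improvement."""
    return mean(post - pre for pre, post in pairs)


for group, pairs in scores.items():
    print(f"{group}: mean change = {mean_change(pairs):+.2f}")
```

A large drop in the experimental group alongside little change in the control group is the pattern that would suggest the medication had an effect, pending a test of statistical significance.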

Before proceeding any further, take a moment and see whether you can
identify the independent and dependent variables in our example. Have
you figured it out? In this example, the new medication is the independent
variable because it is under the researcher’s control and the researcher is
interested in measuring its effect. The Beck Depression Inventory score is
the dependent variable because it is a measure of the effect of the
independent variable.

When students are exposed to research terminology for the first time,
it is not uncommon for them to confuse the independent and dependent
variables. Fortunately, there is an easy way to remember the difference
between the two. If you get confused, think of the independent variable as
the “cause” and the dependent variable as the “effect.” To assist you in
this process, it may be helpful if you practice stating your research
question in the following manner: “What are the effects of _____ on
_____?” The first blank is the independent variable and the second blank is
the dependent variable. For example, we may ask the following research
question: “What are the effects of exercise on levels of body fat?” In this
example, “exercise” is the independent variable and “levels of body fat” is
the dependent variable. Rapid Reference 2.7 summarizes the distinction
between the two, and Rapid Reference 2.8 uses this distinction to further
our understanding of the term “research.”

Rapid Reference 2.7

Independent Variables and Dependent Variables

The independent variable is called “independent” because it is independent
of the outcome being measured. More specifically, the independent variable
is what causes or influences the outcome. The dependent variable is called
“dependent” because it is influenced by the independent variable. For
example, in our hypothetical study examining the effects of medication on
symptoms of depression, the measure of depression is the dependent variable
because it is influenced by (i.e., is dependent on) the independent
variable (i.e., the medication).

Now that we know the difference between independent and dependent
variables, we should focus our attention on how researchers choose these
variables for inclusion in their research studies. An important point to
keep in mind is that the researcher selects the independent and dependent
variables based on the research problem and the hypotheses. In many ways,
this simplifies the process of selecting variables by requiring the
selection of independent and dependent variables to flow logically from the
statement of the research problem and the hypotheses. Once the research
problem and the hypotheses are articulated, it should not take too much
effort to identify the independent and dependent variables.

Rapid Reference 2.8

Definition of “Research”

In Chapter 1, we briefly defined research as an examination of the
relationship between two or more variables. We can now be a little more
specific in our definition of “research.” Research is an examination of the
relationship between one or more independent variables and one or more
dependent variables. In even more precise terms, we can define research as
an examination of the effects of one or more independent variables on one
or more dependent variables.

Perhaps another example will clarify this important point. Suppose that
a researcher is interested in examining the relationship between intake of
dietary fiber and the incidence of colon cancer among elderly males. The
research problem may be stated in the following manner: “Does increased
consumption of dietary fiber result in a decreased incidence of colon
cancer among elderly males?” Using our suggested phrasing from the previous
discussion, we could also ask the following question: “What are the
effects of dietary fiber consumption on the incidence of colon cancer
among elderly males?” Following logically from this research problem, the
researcher may hypothesize the following: “High levels of dietary fiber
consumption will decrease the incidence of colon cancer among elderly
males.” Obviously, several terms in this hypothesis need to be
operationally defined, but we can skip that step for the purposes of the
current example. It takes only a cursory examination of the research
problem and related hypothesis to determine the independent variable and
dependent variable for this study. Have you figured it out yet? Because the
researcher is interested in examining the effects of consuming dietary
fiber on the incidence of colon cancer, “dietary fiber consumption” is the
independent variable and a measure of the “incidence of colon cancer” is
the dependent variable.

Categorical  Variables  vs.  Continuous  Variables 

Now that you are familiar with the difference between independent and
dependent variables, we will turn our attention to another category of
variables with which you should be familiar. The distinction between
categorical variables and continuous variables frequently arises in the
context of many research studies. Categorical variables are variables that
can take on specific values only within a defined range of values. For
example, “gender” is a categorical variable because you can either be male
or female. There is no middle ground when it comes to gender; you can
either be male or female; you must be one, and you cannot be both. “Race,”
“marital status,” and “hair color” are other common examples of categorical
variables. Although this may sound obvious, it is often helpful to think
of categorical variables as consisting of discrete, mutually exclusive
categories, such as “male/female,” “White/Black,”
“single/married/divorced,” and “blonde/brunette/redhead.” In contrast with
categorical variables, continuous variables are variables that can
theoretically take on any value along a continuum. For example, “age” is a
continuous variable because, theoretically at least, someone can be any
age. “Income,” “weight,” and “height” are other examples of continuous
variables. As we will see, the type of data produced from using categorical
variables differs from the type of data produced from using continuous
variables.

Putting It Into Practice

Varying Independent Variables and Measuring Dependent Variables

Assuming that a researcher has a well-articulated and specific hypothesis,
it is a fairly straightforward task to identify the independent and
dependent variables. Often, the difficult part is determining how to vary
the independent variable and measure the dependent variable. For example,
let’s say that a researcher is interested in examining the effects of
viewing television violence on levels of prosocial behavior. In this
example, we can easily identify the independent variable as viewing
television violence and the dependent variable as prosocial behavior. The
difficult part is finding ways to vary the independent variable (how can
the researcher vary the viewing of television violence?) and measure the
dependent variable (how can the researcher measure prosocial behavior?).
Finding ways to vary the independent variable and measure the dependent
variable often requires as much creativity as scientific know-how.

In some circumstances, researchers may decide to convert some continuous
variables into categorical variables. For example, rather than using
“age” as a continuous variable, a researcher may decide to make it a
categorical variable by creating discrete categories of age, such as “under
age 40” or “age 40 or older.” “Income,” which is often treated as a
continuous variable, may instead be treated as a categorical variable by
creating discrete categories of income, such as “under $25,000 per year,”
“$25,000 to $50,000 per year,” and “over $50,000 per year.” The benefit of
using continuous variables is that they can be measured with a higher
degree of precision. For example, it is more informative to record
someone’s age as “47 years old” (continuous) as opposed to “age 40 or
older” (categorical). The use of continuous variables gives the researcher
access to more specific data. See Rapid Reference 2.9.

Rapid Reference 2.9

Categorical Variables vs. Continuous Variables

The decision of whether to use categorical or continuous variables will
have an effect on the precision of the data that are obtained. When
compared with categorical variables, continuous variables can be measured
with a greater degree of precision. In addition, the choice of which
statistical tests will be used to analyze the data is partially dependent
on whether the researcher uses categorical or continuous variables. Certain
statistical tests are appropriate for categorical variables, while other
statistical tests are appropriate for continuous variables. As with many
decisions in the research-planning process, the choice of which type of
variable to use is partially dependent on the question that the researcher
is attempting to answer.
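Converting a continuous variable into a categorical one amounts to sorting each value into a bin. The sketch below uses the age and income categories mentioned above; the function names and the sample values are hypothetical, and the handling of values that fall exactly on a boundary (such as an income of exactly $25,000) is an assumption made here.

```python
def age_category(age):
    """Collapse a continuous age into the two categories used above."""
    return "under age 40" if age < 40 else "age 40 or older"


def income_category(income):
    """Collapse a yearly income (in dollars) into three categories."""
    if income < 25_000:
        return "under $25,000 per year"
    if income <= 50_000:  # assumption: both boundary values fall in the middle bin
        return "$25,000 to $50,000 per year"
    return "over $50,000 per year"


# Hypothetical participants: (age in years, yearly income in dollars).
participants = [(47, 62_000), (23, 18_500), (40, 25_000)]
for age, income in participants:
    print(age, "->", age_category(age), "|", income, "->", income_category(income))
```

Notice that the conversion discards information: once a 47-year-old is recorded as “age 40 or older,” the original value cannot be recovered from the category.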


Quantitative  Variables  vs.  Qualitative  Variables 

Finally, before moving on to a different topic, it would behoove us to
briefly discuss the distinction between qualitative variables and
quantitative variables. Qualitative variables are variables that vary in
kind, while quantitative variables are those that vary in amount (see
Christensen, 2001). This is an important yet subtle distinction that
frequently arises in research studies, so let’s take a look at a few
examples.

Rating something as “attractive” or “not attractive,” “helpful” or “not
helpful,” or “consistent” or “not consistent” are examples of qualitative
variables. In these examples, the variables are considered qualitative
because they vary in kind (and not amount). For example, the thing being
rated is either “attractive” or “not attractive,” but there is no
indication of the level (or amount) of attractiveness. By contrast,
reporting the number of times that something happened or the number of
times that someone engaged in a particular behavior are examples of
quantitative variables. These variables are considered quantitative because
they provide information regarding the amount of something.

As  stated  at  the  beginning  of  this  section,  there  are  several  other  cate¬ 
gories  of  variables  that  we  will  not  be  discussing  in  this  text.  What  we  have 
covered  in  this  section  are  the  major  categories  that  most  commonly  ap¬ 
pear  in  research  studies.  One  final  comment  is  necessary.  It  is  important  to 
keep  in  mind  that  a  single  variable  may  fit  into  several  of  the  categories  that 
we  have  discussed.  For  example,  the  variable  “height”  is  both  continuous 
(if  measured  along  a  continuum)  and  quantitative  (because  we  are  getting 
information regarding the amount of height). Along similar lines, the
variable “eye color” is both categorical (because there is a limited number
of discrete categories of eye color) and qualitative (because eye color
varies in kind, not amount).
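
For readers who find it helpful to see this idea written out, here is a minimal sketch in Python of how a single variable can carry a label from each classification scheme at the same time. The class name `Variable` and its field names are our own illustrative choices and do not come from any statistical package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """A study variable tagged on two independent classification schemes."""
    name: str
    scale: str   # "categorical" or "continuous"
    nature: str  # "qualitative" (varies in kind) or "quantitative" (varies in amount)

    def describe(self) -> str:
        return f"{self.name}: {self.scale} and {self.nature}"


# The two examples from the text.
height = Variable("height", scale="continuous", nature="quantitative")
eye_color = Variable("eye color", scale="categorical", nature="qualitative")

for variable in (height, eye_color):
    print(variable.describe())
```

The two fields are stored separately because the two distinctions answer different questions about the same variable.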

If  this  discussion  of  variables  still  seems  confusing  to  you,  take  comfort 
in  the  fact  that  even  seasoned  researchers  can  still  get  turned  around  on 
these  issues.  As  with  most  aspects  of  research,  repeated  exposure  to  (and 
experience  with)  these  concepts  tends  to  breed  a  comfortable  level  of  fa¬ 
miliarity.  So,  the  next  time  you  come  across  a  research  study,  practice  iden- 
tifying  the  different  types  of  variables  that  we  have  discussed  in  this  section. 

RESEARCH  PARTICIPANTS 

Selecting  participants  is  one  of  the  most  important  aspects  of  planning 
and  designing  a  research  study.  For  reasons  that  should  become  clear  as 
you  read  this  section,  selecting  research  participants  is  often  more  difficult 
and  more  complicated  than  it  may  initially  appear.  In  addition  to  needing 
the  appropriate  number  of  participants  (which  may  be  rather  difficult  in 
large-scale  studies  that  require  many  participants),  researchers  need  to 
have the appropriate kinds of participants (which may be difficult when
resources are limited or the pool of potential participants is small).
Moreover, the manner in which individuals are selected to participate, and
the way those participants are subsequently assigned to groups within the
study, have a dramatic effect on the types of conclusions that can be drawn
from the research study.

At  the  outset,  it  is  important  to  note  that  not  all  types  of  research  stud¬ 
ies  involve  human  participants.  For  example,  the  research  studies  carried 
out  in  many  fields  of  science,  such  as  physics,  biology,  chemistry,  and 
botany,  generally  do  not  involve  human  participants.  For  the  research  sci¬ 
entists  in  these  fields,  the  unit  of  study  may  be  an  atom,  a  cell,  a  molecule, 
or  a  flower,  but  not  a  human  participant.  However,  for  those  researchers 
who  are  involved  in  other  types  of  research,  such  as  social  science  research, 
the  majority  of  their  studies  will  involve  human  participants  in  some  ca¬ 
pacity.  Therefore,  it  is  important  that  you  become  familiar  with  the  proce¬ 
dures  that  are  commonly  employed  by  researchers  to  select  an  appropriate 
group  of  study  participants  and  assign  those  participants  to  groups  within 
the  study.  This  section  will  address  these  two  important  tasks. 

Before  proceeding  any  further,  it  is  worth  noting  that  when  a  researcher 
is  planning  a  study,  he  or  she  must  choose  an  appropriate  research  design 
prior  to  selecting  study  participants  and  assigning  them  to  groups.  In  fact, 
the  specific  research  design  used  in  a  study  often  determines  how  the  par¬ 
ticipants  will  be  selected  for  inclusion  in  the  study  and  how  they  will  be  as¬ 
signed  to  groups  within  it.  However,  because  the  topic  of  choosing  an  ap¬ 
propriate  research  design  requires  an  extensive  and  detailed  discussion,  we 
have  set  aside  an  entire  chapter  to  cover  that  topic  (see  Chapter  5).  There¬ 
fore,  when  reading  this  section,  it  is  important  to  keep  in  mind  that  the 
tasks  of  selecting  participants  and  assigning  those  participants  to  groups 
typically  take  place  after  you  have  chosen  an  appropriate  research  design. 
Accordingly,  you  may  want  to  reread  this  section  after  you  have  read  the 
chapter  on  research  designs  (Chapter  5). 

Selecting  Study  Participants 

For  those  research  studies  that  involve  human  participants,  the  selection  of 
the  study  participants  is  of  the  utmost  importance.  There  are  several  ways 
in  which  potential  participants  can  be  selected  for  inclusion  in  a  research
study,  and  the  manner  in  which  participants  are  selected  is  determined  by 
several  factors,  including  the  research  question  being  investigated,  the  re¬ 
search  design  being  used,  and  the  availability  of  appropriate  numbers  and 
types  of  study  participants.  In  this  section,  we  will  discuss  the  most  com¬ 
mon  methods  used  by  researchers  for  selecting  study  participants. 

For  some  types  of  research  studies,  specific  research  participants  (or 
groups  of  research  participants)  may  be  sought  out.  For  example,  in  a  qual¬ 
itative  study  investigating  the  combat  experiences  of  World  War  II  veter¬ 
ans,  the  researcher  may  simply  approach  identified  World  War  II  veterans 
and  ask  them  to  participate  in  the  study.  Another  example  would  be  an  in¬ 
vestigation  of  the  effects  of  a  Head  Start  program  among  preschool  stu¬ 
dents.  In  this  situation,  the  researcher  may  decide  to  study  an  already  ex¬ 
isting  preschool  class.  The  researcher  could  randomly  select  preschool 
students  to  participate  in  the  study,  but  would  probably  save  both  time  and 
money  by  using  a  preexisting  group  of  students. 

As  you  can  probably  imagine,  there  are  some  difficulties  that  arise  when 
researchers  use  preexisting  groups  or  target  specific  people  for  inclusion 
in  a  research  study.  The  primary  difficulty  is  that  the  study  results  may  not 
be  generalizable  to  other  groups  or  other  individuals  (i.e.,  groups  or  indi¬ 
viduals  not  in  the  study).  For  example,  if  a  researcher  is  interested  in  draw¬ 
ing  broad  conclusions  about  the  effects  of  a  Head  Start  program  on 
preschool  students  in  general,  the  researcher  would  not  want  to  limit  par¬ 
ticipation  in  the  study  to  one  specific  group  of  preschool  students  from 
one  specific  preschool.  For  the  results  of  the  study  to  generalize  beyond 
the  sample  used  in  the  study,  the  sample  of  preschool  students  in  the  study 
would  have  to  be  representative  of  the  entire  population  of  preschool  stu¬ 
dents. 

We  have  introduced  quite  a  few  new  terms  and  concepts  in  this  discus¬ 
sion,  so  we  need  to  make  sure  that  we  are  all  on  the  same  page  before  we 
proceed any further. Let’s start with generalizability. The concept of
generalizability will be covered in detail in future chapters, so we will not spend
too  much  time  on  it  here.  But  we  do  need  to  take  a  moment  and  briefly  dis¬ 
cuss  what  we  mean  when  we  say  that  the  results  of  a  study  are  (or  are  not) 
generalizable.  To  make  this  discussion  more  digestible,  let’s  look  at  a  brief
example. 

Suppose  that  a  researcher  is  interested  in  examining  the  employment 
rate  among  recent  college  graduates.  To  examine  this  issue,  the  researcher 
collects  employment  data  on  1000  recent  graduates  from  ABC  University. 
After  looking  at  the  data  and  conducting  some  simple  calculations,  the 
researcher  determines  that  97.5%  of  the  recent  ABC  graduates  obtained 
full-time  employment  within  6  months  of  graduation.  Based  on  the  results 
of  this  study,  can  the  researcher  reasonably  conclude  that  the  employment 
rate  for  all  recent  college  graduates  across  the  United  States  is  97.5%?  Ob¬ 
viously  not.  But  why?  The  most  obvious  reason  is  that  the  recent  gradu¬ 
ates  from  ABC  University  may  not  be  representative  of  recent  graduates 
from  other  colleges.  Perhaps  recent  ABC  graduates  have  more  success  in 
obtaining  employment  than  recent  graduates  from  smaller,  lesser-known 
colleges.  As  a  result,  there  is  likely  a  great  degree  of  variability  in  the  em¬ 
ployment  rates  of  recent  college  graduates  across  the  United  States. 
Therefore,  it  would  be  misleading  and  inaccurate  to  reach  a  broad  conclu¬ 
sion  about  the  employability  of  all  recent  college  graduates  based  exclu¬ 
sively  on  the  employment  experiences  of  recent  ABC  graduates. 

In  the  previous  example,  the  only  reasonable  conclusion  that  the  re¬ 
searcher  can  reach  is  that  97.5%  of  the  recent  ABC  graduates  in  that  partic¬ 
ular  study  obtained  full-time  employment  within  6  months  of  graduation. 
This  limited  conclusion  would  likely  be  of  little  interest  to  students  outside 
ABC  University  because  the  results  of  the  study  have  no  implications 
for  those  other  students.  For  the  results  of  this  study  to  be  generalizable 
(i.e.,  applicable  to  recent  graduates  from  all  colleges,  not  just  ABC)  the 
researcher  would  need  to  examine  the  employment  rates  for  recent  grad¬ 
uates  from  many  different  colleges.  This  would  have  the  effect  of  ensuring 
that  the  sample  of  participants  is  representative  of  all  recent  college  grad¬ 
uates.  Obviously,  it  would  be  most  informative  and  accurate  if  the  re¬ 
searcher  were  able  to  examine  the  employment  rates  for  all  recent  gradu¬ 
ates  from  all  colleges.  Then,  rather  than  having  to  make  an  inference  about 
the  employment  rate  in  the  population  based  on  the  results  of  the  study, 
the  researcher  would  have  an  exact  employment  rate. 

For  obvious  reasons,  however,  it  is  typically  not  practical  to  include
every  member  of  the  population  of  interest  (e.g.,  all  recent  college  gradu¬ 
ates)  in  a  research  study.  Time,  money,  and  resources  are  three  limiting 
factors  that  make  this  unlikely.  Therefore,  most  researchers  are  forced 
to  study  a  representative  subset — a  sample — of  the  population  of  interest. 
Accordingly,  in  our  example,  the  researcher  would  be  forced  to  study  a 
sample  of  recent  college  graduates  from  the  population  of  all  recent  col¬ 
lege  graduates.  (If  you  need  a  brief  refresher  on  the  distinction  between  a 
sample and a population, see Chapter 1.) If the sample used in the study is
representative  of  the  population  from  which  it  was  drawn,  the  researcher 
can  draw  conclusions  about  the  population  based  on  the  results  obtained 
with  the  sample.  In  other  words,  using  a  representative  sample  is  what  al¬ 
lows  researchers  to  reach  broad  conclusions  applicable  to  the  entire  pop¬ 
ulation  of  interest  based  on  the  results  obtained  in  their  specific  studies. 
For  those  of  you  who  are  still  confused  about  the  concept  of  generaliz- 
ability,  do  not  fret,  because  we  revisit  this  issue  in  later  chapters. 
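
To see why a single college makes a poor stand-in for all colleges, consider the following small simulation, written in Python. Everything in it is hypothetical: the two additional colleges, their enrollment figures, and their employment counts are invented purely for illustration, and only the ABC University figure (975 of 1000, or 97.5%) comes from our example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: (college, number of graduates, number employed).
colleges = [
    ("ABC University", 1000, 975),
    ("College B", 3000, 2400),
    ("College C", 6000, 4200),
]

# One record per graduate: (college name, employed within 6 months?).
population = []
for name, n_graduates, n_employed in colleges:
    population += [(name, True)] * n_employed
    population += [(name, False)] * (n_graduates - n_employed)


def employment_rate(graduates):
    """Proportion of the given graduates who found full-time work."""
    return sum(employed for _, employed in graduates) / len(graduates)


true_rate = employment_rate(population)
abc_only = [g for g in population if g[0] == "ABC University"]
random_sample = rng.sample(population, 1000)

print(f"Entire population:      {true_rate:.4f}")
print(f"ABC graduates only:     {employment_rate(abc_only):.4f}")
print(f"Random sample of 1000:  {employment_rate(random_sample):.4f}")
```

Under these invented numbers, the rate for the whole population is 75.75%, far below the 97.5% observed at ABC alone. A random sample drawn from the whole population should land much closer to the population figure, although it will rarely match it exactly.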

The  discussion  up  to  this  point  should  lead  you  to  an  obvious  question. 
Specifically,  if  choosing  a  representative  sample  is  so  important  for  the 
purposes  of  generalizing  the  results  of  a  study,  how  do  researchers  go 
about  selecting  a  representative  sample  from  the  population  of  interest? 
The  primary  procedure  used  by  researchers  to  choose  a  representative 
sample  is  called  “random  selection.”  Random  selection  is  a  procedure 
through  which  a  sample  of  participants  is  chosen  from  the  population  of 
interest  in  such  a  way  that  each  member  of  the  population  has  an  equal 
probability  of  being  selected  to  participate  in  the  study  (Kazdin,  1992). 
Researchers  using  the  random  selection  procedure  first  define  the  popu¬ 
lation  of  interest  and  then  randomly  select  the  required  number  of  partic¬ 
ipants  from  the  population. 
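
The two steps just described (define the population, then draw from it at random) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster of 200 students is invented, the function name `randomly_select` is our own, and a computer's pseudorandom number generator stands in for whatever randomization procedure a researcher actually uses.

```python
import random


def randomly_select(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every member of `population` has the same chance of being chosen.
    """
    if sample_size > len(population):
        raise ValueError("sample_size cannot exceed the population size")
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Step 1: define the population (here, a hypothetical roster of 200 students).
roster = [f"student_{i:03d}" for i in range(1, 201)]

# Step 2: randomly select the required number of participants.
participants = randomly_select(roster, sample_size=30, seed=7)
print(participants[:5])
```

Note that the procedure requires a complete list of the population before any selection can occur, which is one reason a narrowly defined population is so much easier to work with.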

There  are  two  important  points  to  keep  in  mind  regarding  random 
selection.  The  first  point  is  that  random  selection  is  often  difficult  to  ac¬ 
complish  unless  the  population  is  very  narrowly  defined  (Kazdin,  1992). 
For  example,  random  selection  would  not  be  possible  for  a  population  de¬ 
fined  as  “all  economics  students.”  How  could  we  possibly  define  “all  eco¬ 
nomics  students”?  Would  this  population  include  all  economics  students 
in a particular state, or in the United States, or in the world? Would it
include both current and former economics students? Would it include both
undergraduate  and  graduate  economics  students?  Obviously,  the  popula¬ 
tion  of  “all  economics  students”  is  too  broad,  and  it  would  therefore  be 
impossible  to  select  a  random  sample  from  that  population.  By  contrast, 
random  selection  could  easily  be  accomplished  with  a  population  defined 
as  “all  students  currently  taking  introductory  economics  classes  at  a  par¬ 
ticular  university.”  This  population  is  sufficiently  narrowly  defined,  which 
would  permit  a  researcher  to  use  random  selection  to  obtain  a  representa¬ 
tive  sample. 

As  you  may  have  noticed,  narrowly  defining  the  population  of  interest, 
which  we  have  stated  is  a  requirement  for  random  selection,  has  the  nega¬ 
tive  effect  of  limiting  the  representativeness  of  the  resulting  sample.  This 
certainly  presents  a  catch-22 — we  need  to  narrowly  define  the  population 
to  be  able  to  select  a  representative  sample,  but  by  narrowing  the  popula¬ 
tion,  we  are  limiting  the  representativeness  of  the  sample  we  choose. 

This  brings  us  to  the  second  point  that  you  should  keep  in  mind  re¬ 
garding  random  selection,  namely,  that  the  results  of  a  study  cannot  be 
generalized  based  solely  on  the  random  selection  of  participants  from  the 
population  of  interest.  Rather,  evidence  for  the  generalizability  of  a  study’s 
findings  typically  comes  from  replication  studies.  In  other  words,  the  most 
effective  way  to  demonstrate  the  generalizability  of  a  study’s  findings  is  to 
conduct  the  same  study  with  other  samples  to  see  if  the  same  results  are 
obtained.  Obtaining  the  same  results  with  other  samples  is  the  best  evi¬ 
dence  of  generalizability. 

Despite  the  limitations  that  are  associated  with  random  selection,  it  is  a 
popular  procedure  among  researchers  who  are  attempting  to  ensure  that 
the  sample  of  participants  in  a  particular  study  is  similar  to  the  population 
from  which  the  sample  was  drawn. 

Assigning  Study  Participants  to  Groups 

Once  a  population  has  been  appropriately  defined  and  a  representative 
sample  of  participants  has  been  randomly  selected  from  that  population, 
the next step involves assigning those participants to groups within the
research study, which is one of the most important aspects of conducting research.
In  fact,  Kazdin  (1992)  regards  the  assignment  of  participants  to  groups 
within  a  research  study  as  “the  central  issue  in  group  research”  (p.  85). 
Therefore,  it  is  important  that  you  understand  how  the  assignment  of  par¬ 
ticipants  is  most  effectively  accomplished  and  how  it  affects  the  types  of 
conclusions  that  can  be  drawn  from  the  results  of  a  research  study. 

There  is  almost  universal  agreement  among  researchers  that  the  most 
effective  method  of  assigning  participants  to  groups  within  a  research 
study  is  through  a  procedure  called  “random  assignment.”  The  philosophy 
underlying  random  assignment  is  similar  to  the  philosophy  underlying 
random  selection  (see  Rapid  Reference  2.10).  Random  assignment  involves  as¬ 
signing  participants  to  groups  within  a  research  study  in  such  a  way  that 
each  participant  has  an  equal  probability  of  being  assigned  to  any  of  the 
groups within the study (Kazdin, 1992). Although there are several
accepted methods that can be used to effectively implement random
assignment, it is typically accomplished by using a table of random numbers
that determines the group assignment for each of the participants. (See
Chapter 5 for a discussion and example of random-numbers tables.) By using
a table of random numbers, participants are assigned to groups within the
study according to a predetermined schedule. In fact, group assignment is
determined for each participant prior to his or her entrance into the study
(Kazdin, 1992).
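
The idea of a predetermined assignment schedule can be sketched in Python as follows. Please note the assumptions: a seeded pseudorandom number generator is used here in place of a printed table of random numbers, the function name `assignment_schedule` is our own, and the sketch builds a schedule with equal group sizes, which is only one of several accepted ways to carry out random assignment.

```python
import random


def assignment_schedule(n_participants, groups, seed=None):
    """Build a group-assignment schedule before any participant enrolls.

    Each group label appears equally often, and the order is shuffled, so
    every enrollment slot is equally likely to hold any of the groups.
    """
    if n_participants % len(groups) != 0:
        raise ValueError("n_participants must be a multiple of the number of groups")
    rng = random.Random(seed)
    schedule = list(groups) * (n_participants // len(groups))
    rng.shuffle(schedule)
    return schedule


schedule = assignment_schedule(10, ["experimental", "control"], seed=2024)

# The k-th participant to enter the study receives the k-th scheduled group.
for order, group in enumerate(schedule, start=1):
    print(f"participant {order:2d} -> {group}")
```

Because the schedule is fixed in advance, neither the researcher nor the participant can influence which group a given participant joins.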

Rapid Reference 2.10

Random Selection vs. Random Assignment

Random selection: Choosing study participants from the population of
interest in such a way that each member of the population has an equal
probability of being selected to participate in the study.

Random assignment: Assigning study participants to groups within the study
in such a way that each participant has an equal probability of being
assigned to any of the groups within the study.

Now that you know how participants are most effectively assigned to groups
within a study (i.e., via random assignment), we should spend some time
discussing why random assignment is so important in the context of
research. In short, random assignment is an effective way of ensuring that
the groups within a research study are equivalent (see Rapid Reference
2.11). More specifically, random assignment is a dependable procedure for
producing equivalent groups because it evenly distributes characteristics
of the sample among all of the groups within the study (see Kazdin, 1992).
For example, rather than placing all of the participants over age 40 into
one group, random assignment would, theoretically at least, evenly
distribute all of the participants over age 40 among all of the groups
within the research study. This would produce equivalent groups within the
study, at least with respect to age.

At  this  point,  you  may  be  wondering  why  it  is  so  important  for  a  re¬ 
search  study  to  consist  of  equivalent  groups.  The  primary  importance  of 
having  equivalent  groups  within  a  research  study  is  to  ensure  that  nuisance 
variables  (i.e.,  variables  that  are  not  under  the  researcher’s  control)  do  not 
interfere  with  the  interpretation  of  the  study’s  results  (Kazdin,  1992).  In 
other  words,  if  you  find  a  difference  between  the  groups  on  a  particular  de¬ 
pendent  variable,  you  want  to  attribute  that  difference  to  the  independent 
variable  rather  than  to  a  baseline  difference  between  the  groups.  Let’s  take 
a  moment  and  explore  what  this  means.  In  most  studies,  variables  such  as 
age,  gender,  and  race  are  not  the  primary  variables  of  interest.  However,  if 
these  characteristics  are  not  evenly  distributed  among  all  of  the  groups 
within the study, they could obscure the interpretation of the primary
variables of interest in the study. Let’s take a look at a short example that
should help to clarify these concepts.

Rapid Reference 2.11

Group Equivalence

One of the most important aspects of group research is isolating the
effects of the independent variable. To accomplish this, the experimental
group and control group should be identical, except for the independent
variable. The independent variable would be present in the experimental
group, but not in the control group. Assuming this is the only difference
between the two groups, any observed differences on the dependent variable
can reasonably be attributed to the effects of the independent variable.

A  researcher  interested  in  measuring  the  effects  of  a  new  memory  en¬ 
hancement  strategy  conducts  a  study  in  which  one  group  (i.e.,  the  experi¬ 
mental  group)  is  taught  the  memory  enhancement  strategy  and  the  other 
group  (i.e.,  the  control  group)  is  not  taught  the  memory  enhancement 
strategy.  Then,  all  of  the  participants  in  both  groups  are  administered  a  test 
of  memory  functioning.  At  the  conclusion  of  the  study,  the  researcher 
finds  that  the  participants  who  were  taught  the  new  strategy  performed 
better  on  the  memory  test  than  the  participants  who  were  not  taught  the 
new  strategy.  Based  on  these  results,  the  researcher  concludes  that  the 
memory  enhancement  strategy  is  effective.  However,  before  submitting 
these  impressive  results  for  publication  in  a  professional  journal,  the  re¬ 
searcher  realizes  that  there  is  a  slight  quirk  in  the  composition  of  the  two 
groups  in  the  study.  Specifically,  the  researcher  discovers  that  the  experi¬ 
mental  group  is  composed  entirely  of  women  under  the  age  of  30,  while 
the  control  group  is  composed  entirely  of  men  over  the  age  of  60. 

The  unfortunate  group  composition  in  the  previous  example  is  quite 
problematic  for  the  researcher,  who  is  understandably  disappointed  in  this 
turn  of  events.  Without  getting  too  complicated,  here  is  the  problem  in  a 
nutshell:  Because  the  two  study  groups  differ  in  several  ways — exposure 
to  the  memory  enhancement  strategy,  age,  and  gender — the  researcher 
cannot  be  sure  exactly  what  is  responsible  for  the  improved  memory  per¬ 
formance  of  the  participants  in  the  experimental  group.  It  is  possible,  for 
example,  that  the  improved  memory  performance  of  the  experimental 
group  is  not  due  to  the  new  memory  enhancement  strategy,  but  rather  to 
the  fact  that  the  participants  in  that  group  are  all  under  age  30  and,  there¬ 
fore,  are  likely  to  have  better  memories  than  the  participants  who  are  over 
age  60.  Alternatively,  it  is  possible  that  the  improved  memory  perfor¬ 
mance  of  the  experimental  group  is  somehow  related  to  the  fact  that  all  of 
the  participants  in  that  group  are  women.  In  summary,  because  the  mem¬ 
ory  enhancement  strategy  was  not  experimentally  isolated  and  controlled 
(i.e.,  it  was  not  the  only  difference  between  the  experimental  and  control 
groups),  the  researcher  cannot  be  sure  whether  it  was  responsible  for  the
observed  differences  between  the  groups  on  the  memory  test. 

As  stated  earlier  in  this  section,  the  purpose  of  random  assignment  is  to 
distribute  the  characteristics  of  the  sample  participants  evenly  among 
all  of  the  groups  within  the  study.  By  using  random  assignment,  the  re¬ 
searcher  distributes  nuisance  variables  unsystematically  across  all  of  the 
groups  (see  Kazdin,  1992).  Had  the  researcher  in  our  example  used  ran¬ 
dom  assignment,  the  male  participants  over  age  60  and  the  female  partic¬ 
ipants  under  age  30  would  have  been  evenly  distributed  between  the  ex¬ 
perimental  group  and  the  control  group.  (See  Rapid  Reference  2.12  for  a 
discussion  of  testing  for  group  equivalence.) 
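
To make this concrete, the short Python sketch below randomly assigns a sample like the one in our memory example to two groups and then counts who ended up where. The sample of 50 younger women and 50 older men is hypothetical, and shuffling the sample and splitting it in half is just one simple way to carry out random assignment.

```python
import random
from collections import Counter

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical sample matching the example: 50 younger women, 50 older men.
sample = (
    [("woman under 30", i) for i in range(50)]
    + [("man over 60", i) for i in range(50)]
)

# Random assignment: shuffle the whole sample, then split it in half.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
groups = {"experimental": shuffled[:half], "control": shuffled[half:]}

for name, members in groups.items():
    counts = Counter(kind for kind, _ in members)
    print(name, dict(counts))
```

In a run like this, each group should contain a mix of both kinds of participants rather than only one kind. The split will usually be roughly even rather than exactly even, which is why it is still worth checking the groups after assignment.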

If  the  sample  size  is  large  enough,  the  researcher  can  assume  that  the 
nuisance  variables  are  evenly  distributed  among  the  groups,  which  in¬ 
creases  the  researcher’s  confidence  in  the  equivalence  of  the  groups 
(Kazdin, 1992). This last point should not be overlooked. Random
assignment is most effective with a large sample size (e.g., more than 40
participants per group). In other words, the likelihood of obtaining
equivalent groups increases as the sample size increases. Once participants
have been randomly assigned to groups within the study, the researcher is
then ready to begin collecting data. (Both random selection and random
assignment will be discussed in more detail in Chapter 3 as strategies for
controlling artifact and bias.)

Rapid Reference 2.12

Equivalence Testing

Although using random assignment with large samples can be assumed to
produce equivalent groups, it is wise to statistically examine whether the
two groups are indeed equivalent. This is accomplished by comparing the
two groups on nuisance variables to see whether the two groups differ
significantly. If there are no statistically significant differences
between the two groups on any of the nuisance variables, the researcher can
be confident that the two groups are equivalent. In this situation, any
observed effects on the dependent variables can reasonably be attributed to
the independent variable (and not to any of the nuisance variables). By
contrast, if the two groups are not equivalent on one or more of the
nuisance variables, there are statistical steps that a researcher can take
to ensure that the differences do not affect the interpretation of the
study’s results.
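
The kind of comparison described in Rapid Reference 2.12 can be sketched in Python as shown below. This is a rough illustration, not a recommended analysis: the ages are randomly generated and hypothetical, the function names are our own, Welch's t statistic is simply one common choice for comparing two group means, and the p-value relies on a normal approximation that is reasonable only when the groups are fairly large. In practice, researchers would typically rely on a dedicated statistics package.

```python
import math
import random
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


def approx_two_sided_p(t_statistic):
    """Two-sided p-value from a normal approximation (large groups only)."""
    return 2 * (1 - statistics.NormalDist().cdf(abs(t_statistic)))


rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical nuisance variable: the ages of 100 participants.
ages = [rng.randint(18, 70) for _ in range(100)]

# Random assignment: shuffle, then split into two groups of 50.
rng.shuffle(ages)
experimental, control = ages[:50], ages[50:]

t = welch_t(experimental, control)
p = approx_two_sided_p(t)
print(f"mean age: experimental={statistics.fmean(experimental):.1f}, "
      f"control={statistics.fmean(control):.1f}")
print(f"t = {t:.2f}, approximate p = {p:.3f}")
```

The same check would be repeated for each nuisance variable of interest (e.g., age, gender, and race), using a test suited to that variable's type.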

MULTICULTURAL  CONSIDERATIONS 

One  final  and  important  topic  in  this  chapter  is  the  relationship  between 
multicultural  issues  and  research  studies.  In  research,  as  in  most  other  ar¬ 
eas  of  life  at  the  beginning  of  the  21st  century,  considerations  surround¬ 
ing  multiculturalism  (see  Rapid  Reference  2.13)  have  taken  on  increased 
visibility  and  importance.  As  a  result,  there  is  a  growing  need  for  re¬ 
searchers  at  all  levels  and  in  all  settings  to  become  familiar  with  the  role  of 
multiculturalism  in  all  aspects  of  research  studies. 

Multicultural  considerations  are  important  in  two  distinct  ways  when  it 
comes  to  conducting  research  studies.  First,  multicultural  considerations 
often  have  a  considerable  effect  on  a  researcher’s  choice  of  research  ques¬ 
tion  and  research  design  (even  if  the  researcher  is  unaware  of  the  role 
played by multicultural considerations in those decisions). Second,
multicultural considerations are important in the selection and composition
of the sample of participants used in particular research studies. In
other words, multicultural considerations are important with respect to
both the researcher and the study sample. This section will address both of
these important considerations.

Rapid Reference 2.13

Multiculturalism

When considered in its broadest sense, a researcher who has achieved
multicultural competence is cognizant of differences among study
participants related to race, ethnicity, language, sexual orientation,
gender, age, disability, class status, education, and religious or
spiritual orientation (American Psychological Association, 2003).

Multiculturalism and Researchers

As the population of the United States becomes increasingly diverse, there
is a growing need for researchers to become more aware of the
impact  of  multicultural  issues  on  the  planning  and  designing  of  research 
studies  (Reid,  2002).  Using  the  current  lingo,  it  can  be  stated  that  there 
is  a  need  for  researchers  to  achieve  “multicultural  competence.”  For  re¬ 
searchers,  the  first  step  in  achieving  multicultural  competence  is  becom¬ 
ing  aware  of  how  their  own  worldviews  affect  their  choice  of  research 
questions  (American  Psychological  Association  [APA],  2003).  These 
worldviews  necessarily  include  researchers’  views  of  their  own  cultures  as 
well  as  their  views  of  other  cultures.  Researchers  must  acknowledge  that 
their  worldviews  likely  play  an  integral  role  in  shaping  their  views  of  hu¬ 
man  behavior.  Hence,  their  theories  of  human  behavior,  as  well  as  the  re¬ 
search  questions  and  hypotheses  that  stem  from  those  theories,  are  based 
on  assumptions  particular  to  their  own  culture — and  it  is  these  assump¬ 
tions  of  which  researchers  must  be  aware  (see  Egharevba,  2001). 

To  increase  awareness  of  multicultural  issues  in  the  conceptualization 
of  research  designs,  the  researcher  often  benefits  from  consulting  with 
members  of  diverse  and  traditionally  underrepresented  cultural  groups 
(APA,  2003;  Quintana,  Troyano,  &  Taylor,  2001).  This  serves  the  purpose 
of  providing  perspectives  and  insights  that  may  not  have  otherwise  been 
considered  by  the  researcher  acting  alone.  Considering  different  view¬ 
points  from  members  of  diverse  cultural  groups  facilitates  the  develop¬ 
ment  of  a  culturally  competent  research  design  that  has  the  potential  to 
benefit  people  from  many  different  cultures.  Along  similar  lines,  it  is  also 
important  for  researchers  to  recognize  the  limitations  of  their  research  de¬ 
signs  in  terms  of  applicability  to  diverse  cultural  groups. 

Researchers  also  need  to  be  aware  of  multicultural  considerations  when 
deciding  on  assessment  techniques  and  instruments  for  their  studies.  For 
example,  when  working  with  a  culturally  diverse  sample,  it  is  important 
that  researchers  use  instruments  and  assessment  techniques  that  have 
been  validated  with  culturally  diverse  groups  (see  Council  of  National 
Psychological Associations for the Advancement of Ethnic Minority
Interests, 2000). According to the APA’s Guidelines on Multicultural Education,
Training, Research, Practice, and Organizational Change for Psychologists (2003,
p. 389), “psychological researchers are urged to consider culturally
sensitive assessment techniques, data-generating procedures, and standardized
instruments whose validity, reliability, and measurement equivalence have
been tested across culturally diverse sample groups. . .”

Finally,  when  it  comes  to  interpreting  data  and  drawing  conclusions,  re¬ 
searchers  need  to  consider  the  role  of  culture  and  cultural  hypotheses.  It 
is  conceivable,  for  example,  that  there  is  a  culturally  based  explanation  for 
the  research  study’s  findings,  and  it  therefore  may  be  prudent  to  statisti¬ 
cally  examine  relevant  cultural  variables.  Researchers  also  need  to  be  cog¬ 
nizant  of  the  cultural  limitations  and  generalizability  of  the  research 
study’s  results. 

Multiculturalism  and  Study  Participants 

In  the  preceding  section,  we  emphasized  the  importance  of  multicultural 
considerations  in  terms  of  formulating  a  research  question,  choosing  an 
appropriate  research  design,  selecting  assessment  strategies,  and  analyzing 
data  and  drawing  conclusions.  In  this  section,  we  will  focus  on  multicul¬ 
tural  considerations  as  they  relate  to  selecting  the  research  participants 
who  make  up  the  study  sample.  As  you  will  see,  the  inclusion  of  people 
from  diverse  cultural  backgrounds  in  study  samples  has  attracted  a  great 
deal  of  attention  in  recent  years. 

The  debate  regarding  the  appropriate  composition  of  study  samples 
is  no  longer  exclusively  in  the  domain  of  researchers.  The  federal  govern¬ 
ment  has  voiced  an  opinion  on  this  important  issue.  In  1993,  President 
Clinton  signed  into  law  the  NIH  Revitalization  Act  of  1993  (PL  103-43), 
which  directed  the  National  Institutes  of  Health  (NIH)  to  establish  guide¬ 
lines  for  the  inclusion  of  women  and  minorities  in  clinical  research.  On 
March 9, 1994, in response to the mandate contained in the NIH Revitalization
Act, the NIH issued NIH Guidelines on the Inclusion of Women and Minorities
as Subjects in Clinical Research (henceforth “NIH Guidelines”).

According  to  the  NIH  Guidelines,  because  research  is  designed  to  pro¬ 
vide  scientific  evidence  that  could  lead  to  a  change  in  health  policy  or  a 
standard of care, it is imperative to determine whether the intervention being
studied affects both genders as well as diverse racial and ethnic groups
differently. Therefore, all NIH-supported biomedical and behavioral research
involving human participants is required to be carried out in a
manner  that  elicits  information  about  individuals  of  both  genders  and 
from  diverse  racial  and  ethnic  backgrounds.  According  to  the  Office  for 
Protection  From  Research  Risks,  which  is  part  of  the  U.S.  Department  of 
Health  and  Human  Services,  the  inclusion  of  women  and  minorities  in  re¬ 
search  will,  among  other  things,  help  to  increase  the  generalizability  of  the 
study’s  findings  and  ensure  that  women  and  minorities  benefit  from  the 
research.  Although  the  NIH  Guidelines  apply  only  to  studies  conducted  or 
supported  by  the  NIH,  all  other  researchers  and  research  institutions  are 
encouraged  to  include  women  and  minorities  in  their  research  studies,  as 
well. 

SUMMARY 

In  this  chapter,  we  have  covered  the  research-related  issues  that  are  most 
commonly  encountered  by  researchers  when  they  are  planning  and  de¬ 
signing  research  studies.  There  are  certainly  other  topics  related  to  plan¬ 
ning  and  designing  a  research  study  that  we  could  have  included  in  this  dis¬ 
cussion  (e.g.,  choosing  study  instruments),  but  we  chose  to  take  a  broad 
approach  because  of  the  inherent  uniqueness  of  research  studies.  Rather 
than discussing topics that are specific to particular types of studies, we
believed that it would be most beneficial to make the discussion more general
by focusing on the research-related topics that are encountered by virtually
all researchers when planning and designing studies.

TEST YOURSELF


1. Researchers become familiar with the existing literature on a particular
topic by conducting a ________.

2. Researchers use ________ to attempt to explain, predict, and explore the
phenomenon of interest.

3. The ________ hypothesis always predicts that there will be no differences
between the groups being studied.

4. The ________ is a measure of the effect (if any) of the independent
variable.

5. The most effective method of assigning participants to groups within a
research study is through a procedure called ________.

Answers: 1. literature review; 2. hypotheses; 3. null; 4. dependent variable; 5. random assignment

GENERAL APPROACHES FOR
CONTROLLING  ARTIFACT  AND  BIAS 


In  Chapter  6,  we  will  discuss  the  four  main  types  of  experimental  valid¬ 
ity  and  the  potential  threats  associated  with  each.  These  threats  are  also 
referred  to  as  confounds,  or  sources  of  artifact  and  bias.  Remember  that 
we  conduct  research  to  systematically  study  specified  variables  of  interest. 
Any  variable  that  is  not  of  interest,  but  that  might  influence  the  results,  can 
be  referred  to  as  a  potential  confound,  artifact,  or  source  of  bias.  The  pri¬ 
mary  purpose  of  research  design  is  to  eliminate  these  sources  of  bias  so 
that  more  confidence  can  be  placed  in  the  results  of  the  study.  Identifying 
potential  sources  of  artifact  and  bias  is  therefore  an  essential  first  step  in 
ensuring  the  integrity  of  any  conclusions  drawn  from  the  data  obtained 
during  a  study.  Once  the  threats  are  identified,  appropriate  steps  can  be 
taken  to  reduce  their  impact. 

Unfortunately,  even  the  most  seasoned  researchers  cannot  account  for 
or  foresee  every  potential  source  of  artifact  and  bias  that  might  confound 
the  results  or  be  present  in  a  research  design.  In  this  chapter,  we  will  dis¬ 
cuss  general  strategies  and  controls  that  can  be  used  to  reduce  the  impact 
of  artifact  and  bias.  These  strategies  are  very  useful  in  that  they  help  reduce 
the  impact  of  artifact  and  bias  even  when  the  researcher  is  not  aware  that 
they  exist  in  the  study.  These  strategies  should  be  considered  early  in  the 
design  phase  of  a  research  study.  Early  consideration  allows  the  researcher 
to  take  a  proactive,  preventive  approach  to  potential  artifacts  and  biases 
and minimizes the need to be reactive as problems arise later in the
study.  Early  consideration  cannot  be  overemphasized  because  the  worth 
of  the  findings  of  any  research  study  is  directly  related  to  the  reduction  or 
elimination of confounding sources of artifact and bias. Implementing
these  basic  strategies  also  reduces  threats  to  validity  and  bolsters  the  con¬ 
fidence  we  can  place  in  the  findings  of  a  study. 

A  BRIEF  INTRODUCTION  TO  VALIDITY 

Our  introduction  to  this  chapter  suggests  that  the  purpose  of  research  is 
to  provide  valid  conclusions  regarding  a  wide  range  of  researchable  phe¬ 
nomena.  Although  we  discuss  it  in  detail  in  Chapter  6,  a  brief  discussion  of 
the  concept  of  validity  is  necessary  here  to  frame  our  general  discussion  of 
the  experimental  control  of  artifact  and  bias.  Validity  refers  to  the  concep¬ 
tual  and  scientific  soundness  of  a  research  study  or  investigation,  and  the 
primary  purpose  of  all  forms  of  research  is  to  produce  valid  conclusions. 

Researchers are usually interested in studying the relationship among specific
variables to the exclusion of other, perhaps irrelevant, variables. To
produce valid (that is, meaningful and accurate) conclusions, researchers must
strive to eliminate or minimize the effects of extraneous influences, variables,
and explanations that might detract from the accuracy of a study’s
ultimate findings. Put simply, validity is related to research methodology
because the primary purpose of methodology is to increase the accuracy and
usefulness of findings by eliminating or controlling as many confounding
variables as possible, which allows for greater confidence in the findings of
any given study. Chapter 6 discusses the main types of validity and the
specific threats related to each, so we will not go into more detail here.
The remaining material in this chapter discusses general design strategies
that can be used to help ensure that the conclusions drawn from the results
of a study are valid.

SOURCES  OF  ARTIFACT  AND  BIAS 

In Chapter 6, we discuss the most common threats to validity. The material
in Chapter 6 is very specific to the four main types of validity encountered
in research design and methodology: internal, external, construct,
and statistical conclusion validity (see Rapid Reference 3.1). By contrast,
the aim of this chapter is more general. While Chapter 6 discusses specific
artifacts, biases, and confounds as they relate to the four main types of
validity, this chapter provides valuable information on general sources of
artifact and bias that can exist in most forms of research design. It also
provides a framework for minimizing or eliminating a wide variety of these
confounds without directly addressing specific threats to validity.

Rapid Reference 3.1

Four Types of Validity

• Internal validity refers to the ability of a research design to rule out
or make implausible alternative explanations of the results, or plausible
rival hypotheses. (A plausible rival hypothesis is an alternative interpretation
of the researcher’s hypothesis about the interaction of the dependent
and independent variables that provides a reasonable explanation
of the findings other than the researcher’s original hypothesis.)

• External validity refers to the generalizability of the results of a
research study. In all forms of research design, the results and conclusions
of the study are limited to the participants and conditions as defined by
the contours of the research. External validity refers to the degree to
which research results generalize to other conditions, participants,
times, and places.

• Construct validity refers to the basis of the causal relationship and is
concerned with the congruence between the study’s results and the
theoretical underpinnings guiding the research. In essence, construct
validity asks the question of whether the theory supported by the findings
provides the best available explanation of the results.

• Statistical validity refers to aspects of quantitative evaluation that
affect the accuracy of the conclusions drawn from the results of a study.
At its simplest level, statistical validity addresses the question of
whether the statistical conclusions drawn from the results of a study
are reasonable.

Although sources of artifact and bias can be classified across a number
of broad categories, these categories are far from all-inclusive or exhaustive.
The reason for this is that every research study is distinct and is faced
with its own unique sources of artifact and bias that may threaten the
validity of its findings. In addition, sources of artifact and bias can occur
in isolation or in combination, further compounding the potential threats to
validity. Researchers must be aware of these potential threats and control
for them accordingly. Failure to implement appropriate controls at the outset
of a study may substantially reduce the researcher’s ability to draw confident
inferences of causality from the study findings. Fortunately, there are several
ways that the researcher can control for the effects of artifact and bias. The
most effective methods include the use of statistical controls, control and
comparison groups, and randomization (a more complete list is found in Rapid
Reference 3.2).

A short discussion of sources of artifact and bias is necessary before we
can address methods for minimizing or eliminating their impact on the
validity of study findings. As mentioned, the types of potential sources of
artifact and bias are virtually endless; for example, the heterogeneity of
research participants alone can contribute innumerable sources. Research
participants bring a wide variety of physical, psychological, and emotional
traits into the research context. These different characteristics can directly
affect the results of a study. Similarly, an almost endless array of
environmental factors can influence a study’s results. For example, consider what
your level of attention and/or motivation might be like in an excessively
warm classroom versus one that is comfortable and conducive to learning.
As you will see in Chapter 4, measurement issues can also introduce artifact
and bias into the study. The use of poorly validated or unreliable measurement
strategies can contribute to misleading results (Leary, 2004;
Rosenthal & Rosnow, 1969). To make matters worse, sources of artifact
and bias can also combine and interact (e.g., as when one is taking a poorly
validated test in an uncomfortable classroom) to further reduce the validity
of study findings. Despite the potentially infinite types and combinations
of artifact and bias, they can generally be seen as falling into one of
several primary categories.

Rapid Reference 3.2

Methods for Controlling Sources of Artifact and Bias

• Statistical controls

• Control and comparison groups

• Random selection

• Random assignment

• Experimental design


Experimenter  Bias 

Ironically, the researchers themselves are the first common source of artifact
and bias (Kintz, Delprato, Mettee, Persons, & Shappe, 1965). Frequently
called experimenter bias, this source of artifact and bias refers to the
potential for researchers themselves to inadvertently influence the behavior
of research participants in a certain direction (Adair, 1973; Beins,
2004). In other words, a researcher who holds certain beliefs about
the nature of his or her research and how the results will or should
turn out may intentionally or unintentionally influence the outcome
of the study in a way that favors his or her expected outcome
(Barber & Silver, 1968); the Rosenthal and Pygmalion effects
(see Rapid Reference 3.3) are examples.

Rapid Reference 3.3

The Rosenthal and Pygmalion Effects

The Rosenthal and Pygmalion effects are examples of experimenter bias.
Both of these terms refer to the documented phenomenon that researchers’
expectations (rather than the experimental manipulation) can bias the
outcome of a study by influencing the behavior of their participants.

DON'T FORGET

Experimenter Bias

Experimenter bias exists when researchers inadvertently influence
the behavior of research participants in a way that favors the
outcomes they anticipate.

Experimenter bias can manifest itself across a wide variety of
circumstances and settings. For example, a researcher might interpret
data in such a way that they support his or her theoretical orientation
or a particular theoretical paradigm. Similarly, the researcher might be
tempted to change the original research hypotheses to fit the actual data
when it becomes apparent that the data do not support the original
hypotheses. A related bias occurs when researchers blatantly ignore findings
that do not support their hypotheses. Other, more innocuous examples
include subtle errors in data collection and recording and unintentional
deviations from standardized procedures. These biases are particularly
prevalent in studies in which a single researcher is responsible for
generating the hypotheses, designing the study, and collecting and analyzing the
data (Barber, 1976). Let’s now consider how experimenter bias might
specifically manifest itself in the context of research methodology.

Consider  an  example  in  which  a  researcher  is  studying  the  efficacy  of 
different  types  of  psychotherapy.  The  study  is  comparing  three  different 
types  of  therapy,  and  our  researcher  has  a  personal  belief  that  one  of  the 
three  is  superior  to  the  other  two  treatments.  Our  researcher  is  involved  in 
conducting  screening  assessments  of  symptom  levels,  and  based  on  those 
results,  assigns  participants  to  the  different  treatment  conditions.  The  re¬ 
searcher’s  personal  interest  in  one  particular  form  of  therapy  might  lead  to 
the  introduction  of  a  potential  source  of  artifact  or  bias.  For  example,  if  the 
researcher  thinks  that  his  or  her  therapeutic  preference  is  superior,  then 
individuals with greater symptom levels might be unconsciously (or inadvertently)
assigned to that treatment group. Here, the underlying bias
might  be  that  a  superior  form  of  treatment  is  necessary  to  help  the  partic¬ 
ipants  in  question.  This  could  work  in  the  other  direction  as  well,  when  the 
researcher  unconsciously  (or  inadvertently)  assigns  participants  with  low 
symptom  levels  to  the  treatment  of  choice.  Either  approach  can  bias  the 
results  and  blur  the  findings  as  they  relate  to  the  relationship  between  the 
intervention  and  symptom  level,  or  independent  and  dependent  variables. 
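
One way to take the researcher's judgment out of this step is random assignment, which this chapter lists among the methods for controlling artifact and bias. The following Python sketch is only an illustration under stated assumptions: the participant IDs, the three condition names, and the function name `randomly_assign` are hypothetical and do not come from any particular study. It shows that group membership can be decided by a shuffled order alone, so that screening results and the researcher's preferences play no part.

```python
import random
from collections import Counter


def randomly_assign(participant_ids, conditions, seed=None):
    """Assign each participant to a condition using only a random shuffle.

    No participant characteristic (e.g., symptom level) is consulted, so the
    person running the screening cannot steer anyone toward a preferred group.
    """
    rng = random.Random(seed)          # a fixed seed makes the list reproducible for audits
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled participants out in turn so group sizes differ by at most one.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}


# Hypothetical example: 12 participants, three therapies.
participants = [f"P{n:02d}" for n in range(1, 13)]
conditions = ["therapy_A", "therapy_B", "therapy_C"]

assignment = randomly_assign(participants, conditions, seed=2024)
group_sizes = Counter(assignment.values())

for pid in sorted(assignment):
    print(pid, assignment[pid])
print(dict(group_sizes))
```

In practice, the assignment list would ideally be generated by someone other than the researcher who conducts the screening, in keeping with the division of roles discussed later in this chapter.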

A  subtler  example  could  simply  be  the  fact  that  the  researcher  uncon¬ 
sciously  treats  some  participants  differently  from  others  during  the  ad¬ 
ministration  of  the  screening  or  other  aspects  of  the  treatment  interven¬ 
tions.  Perhaps  the  researcher  is  having  a  particularly  bad  or  stressful  day 
and  is  not  as  engaging  or  amiable  as  he  or  she  might  otherwise  be  while  in¬ 
teracting  with  the  participants.  Participants  might  feel  somewhat  different 
after  interacting  with  the  researcher  and  this  might  have  an  impact  on  their 
self-report  of  symptoms  or  their  attitudes  toward  engaging  in  the  study. 

Another example of experimenter bias is related to training and sophistication.
Like people in general, researchers possess varying levels of
knowledge  and  sophistication,  which  can  have  a  significant  impact  on  any 
study.  Consider  our  previous  therapy  example.  Let’s  assume  that  three 
different  researchers  are  conducting  the  therapeutic  interventions.  One 
researcher has 20 years of experience, another has 10, and the third
is  just  out  of  graduate  school  and  has  little  practical  experience.  Any  re¬ 
sults  that  we  might  obtain  from  this  study  might  be  a  reflection  more  of 
therapist  experience  than  of  the  nature  and  effectiveness  of  the  three  dif¬ 
ferent  types  of  therapy.  Although  subtle,  experimenter  biases  can  have  a 
significant  impact  on  the  validity  of  the  research  findings  because  they 
can  blur  the  relationship  between  the  independent  and  dependent  vari¬ 
ables. 

Controlling  Experimenter  Bias 

As  just  mentioned,  experimenter  bias  can  have  substantial  negative  im¬ 
pacts  on  the  overall  validity  of  a  study.  Fortunately,  there  are  a  number  of 
strategies  (listed  in  Rapid  Reference  3.4)  that  can  be  employed  to  minimize 
the  impact  of  these  biases. 

The  first  strategy  is  to  maintain  careful  control  over  the  research  pro¬ 
cedures.  The  goal  of  this  approach  is  to  hold  study  procedures  constant, 
in  an  attempt  to  minimize  unforeseen  variance  in  the  research  design.  In 
other  words,  all  procedures  should  be  carefully  standardized.  This  might 
include  the  use  of  manualized  study  procedures,  standardized  instru¬ 
ments,  and  uniform  scripts  for  interacting  with  research  participants. 
Some  studies  go  so  far  as  to  try  to  anticipate  participant  questions  and  be¬ 
haviors  and  script  out  appropriate  responses  for  researchers  to  follow. 

Typically,  this  type  of  control  is  limited  to  the  recruitment  and  assess¬ 
ment  of  participants  and  to  the  giving  of  standardized  instructions 
throughout  the  study.  Inclusion  criteria  and  standards  are  usually  devel¬ 
oped  to  ensure  that  only  appropriate  participants  are  included  in  the  study. 
Achieving this type of control is more difficult than it might sound. Remember
that research participants bring a wide range of individual differences
to any research study. Despite this, there are other steps related to
constancy that researchers can employ to minimize the impact of experimenter
bias.

Rapid Reference 3.4

Strategies for Minimizing Experimenter Effects

• Carefully control or standardize all experimental procedures.

• Provide training and education on the impact and control of experimenter
effects to all of the researchers involved in the study.

• Minimize dual or multiple roles within the study.

• When multiple researcher roles are necessary, provide appropriate
checks and balances and quality control procedures, whenever possible.

• Automate procedures, whenever possible.

• Conduct data collection audits and ensure accuracy of data entry.

• Consider using a statistical consultant to ensure impartiality of results
and choice of appropriate statistical analyses.

• Limit the knowledge that the researcher or researchers have regarding
the nature of the hypotheses being tested, the experimental manipulation,
and which participants are either receiving or not receiving the
experimental manipulation.

One  of  the  more  common  approaches  to  achieving  constancy  is  to  pro¬ 
vide  training  and  education  on  the  impact  and  control  of  experimenter  ef¬ 
fects  to  all  of  the  researchers  involved  in  the  study.  Although  it  has  been 
said  that  ignorance  is  bliss,  this  is  usually  not  the  case  in  research  design. 
Ignorance  of  the  potential  impact  of  researcher  behavior  and  attitudes  on 
the  results  of  a  study  is  a  common  source  of  bias  that  can  be  easily  ad¬ 
dressed  through  education  and  training.  Awareness  of  the  potential  impact 
of  behavior  is  usually  the  first  step  in  making  sure  that  the  behavior  does 
not  go  unregulated  or  unchecked  in  a  research  context.  Training  and  edu¬ 
cation  are  essential  when  there  are  varying  levels  of  expertise  among  re¬ 
searchers  or  when  the  researchers  have  enlisted  the  help  of  support  staff 
who possess little experience in conducting research. At a minimum, training
in this area should include a discussion of the most common types of
experimenter  effects  and  how  they  are  best  minimized  or  eliminated. 

As  noted  previously,  there  are  numerous  types  of  experimenter  effects 
that  can  bias  the  results  of  a  study.  Some  can  be  minimized  through  aware¬ 
ness  and  training,  and  others  through  standardized  procedures.  We  also 
mentioned  that  experimenter  effects  might  be  more  prevalent  when  one 
individual  is  acting  in  multiple  roles  within  the  study.  This  is  particularly 
true  in  smaller  studies  for  which  funding  and  resources  are  limited,  such  as 
graduate  school  dissertation  research. 

The problem that multiple roles create with respect to experimenter effects
is an obvious one: temptation. The solution is relatively simple: use multiple
researchers and provide appropriate checks and balances and quality
control  procedures  whenever  possible.  It  might  also  be  helpful  to  divide 
responsibilities  in  a  way  that  minimizes  possible  confounds  and  tempta¬ 
tions  to  act  in  a  way  that  might  be  inconsistent  with  drawing  valid  conclu¬ 
sions  from  the  results  of  the  study.  Let’s  consider  some  examples. 

Checks  and  balances,  or  quality  control  procedures,  are  essential  for 
eliminating  potential  experimenter  biases.  As  discussed  previously,  stan¬ 
dardized  procedures  are  the  first  step  in  ensuring  the  strength  of  a  research 
design.  Participant  inclusion  criteria,  scripts,  standardized  interventions, 
and  control  of  the  experimental  environment  are  all  examples  of  stan¬ 
dardizing  various  aspects  of  a  research  design.  There  are  other  steps  re¬ 
lated  to  standardization  that  can  be  taken  to  further  bolster  validity  and 
minimize  potential  experimenter  effects.  Unfortunately,  many  of  these  ap¬ 
proaches  are  labor  intensive  and  require  multiple  researchers.  When  the 
inclusion  of  multiple  researchers  is  not  possible,  informal  consultation 
with  knowledgeable  colleagues  should  be  utilized  whenever  possible. 

Most studies begin with development of the research question, construction
of the research design, and generation of hypotheses. Having multiple
researchers involved in planning a research study brings a diversity of views
and  opinions  that  should  minimize  the  likelihood  of  a  poorly  conceptual¬ 
ized  research  design.  With  an  effective  and  appropriate  design  in  place, 
multiple  researchers  can  also  be  used  to  ensure  that  other  aspects  of  the 
study are executed in a way that helps minimize or eliminate experimenter
bias.  For  example,  multiple  researchers  could  develop  participant  inclu¬ 
sion  and  exclusion  criteria.  Similarly,  participant  inclusion  might  be  de¬ 
pendent  on  agreement  by  two  or  more  researchers  as  to  whether  the  par¬ 
ticipant  meets  the  required  criteria. 
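
To make the idea of agreement between researchers concrete, the sketch below compares the include/exclude decisions of two hypothetical raters. It reports simple percent agreement and also Cohen's kappa, a chance-corrected agreement statistic that we bring in here as an illustration; the text above does not prescribe any particular statistic. The rater decisions, participant IDs, and function names are invented for the example, and the enrollment rule (enroll only when both raters say "include") is one possible reading of "agreement by two or more researchers."

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of cases.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both raters used one identical category for every case
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for eight candidates.
ids = [f"P{n}" for n in range(1, 9)]
rater_a = ["include", "include", "exclude", "include",
           "exclude", "include", "exclude", "include"]
rater_b = ["include", "include", "exclude", "exclude",
           "exclude", "include", "exclude", "include"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)

# Enroll a candidate only when both raters independently say "include".
enrolled = [pid for pid, a, b in zip(ids, rater_a, rater_b) if a == b == "include"]

print(f"agreement = {agreement:.3f}, kappa = {kappa:.2f}, enrolled = {enrolled}")
```

Cases on which the raters disagree (here, the fourth candidate) would typically be resolved by discussion or by a third researcher rather than by either rater alone.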

Multiple  researchers  can  also  act  as  a  quality  control  mechanism  for  the 
actual  delivery  of  the  intervention,  or  independent  variable.  Again,  more 
than  one  researcher  might  be  involved  in  designing  the  intervention  re¬ 
lated  to  the  independent  variable,  and  then  in  confirming  that  the  inter¬ 
vention  was  actually  delivered  to  the  participants  in  the  required  fashion. 

Data collection and analysis are another area where multiple researchers
can be an asset in minimizing or eliminating experimenter bias. Audits can
be  conducted  to  determine  whether  mistakes  were  made  in  the  data  col¬ 
lection  or  data  entry  processes.  Similarly,  multiple  researchers  can  help  en¬ 
sure  that  the  correct  statistical  analyses  are  conducted  and  that  the  results 
are  reported  in  an  accurate  manner  (O’Leary,  Kent,  &  Kanowitz,  1975).  A 
statistical  expert  should  be  consulted  whenever  there  is  uncertainty  about 
which  statistical  approaches  might  best  be  used  to  answer  the  research 
question.  Finally,  this  approach  can  be  useful  in  the  communication  of  the 
results  of  the  study  because  multiple  authors  bring  a  more  diverse  view  to 
the  conceptualization,  interpretation,  and  application  of  the  findings. 
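
One simple form a data entry audit can take is double entry: two people key in the same records independently, and any mismatch is flagged for checking against the original forms. The sketch below assumes that approach, which is our illustration rather than a procedure specified in the text; the record layout, field names, values, and the function name `audit_double_entry` are all hypothetical.

```python
def audit_double_entry(entry_one, entry_two):
    """Compare two independent data entries and list every mismatch.

    Each entry maps a participant ID to a dict of field -> value.
    Returns tuples of (participant_id, field, value_in_entry_one, value_in_entry_two).
    A missing record or field shows up as None on the side where it is absent.
    """
    discrepancies = []
    for pid in sorted(set(entry_one) | set(entry_two)):
        record_one = entry_one.get(pid, {})
        record_two = entry_two.get(pid, {})
        for field in sorted(set(record_one) | set(record_two)):
            value_one = record_one.get(field)
            value_two = record_two.get(field)
            if value_one != value_two:
                discrepancies.append((pid, field, value_one, value_two))
    return discrepancies


# Hypothetical records keyed in by two different staff members.
entry_one = {
    "P01": {"age": 34, "symptom_score": 21},
    "P02": {"age": 45, "symptom_score": 17},
    "P03": {"age": 29, "symptom_score": 30},
}
entry_two = {
    "P01": {"age": 34, "symptom_score": 12},   # digits transposed
    "P02": {"age": 45, "symptom_score": 17},
    # P03 was never entered the second time
}

discrepancies = audit_double_entry(entry_one, entry_two)
for item in discrepancies:
    print(item)
```

Each flagged item would then be resolved by returning to the source document, preferably by someone who did not make either entry.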

There  are  other  methodological  approaches  that  allow  us  to  further 
minimize  the  impact  of  experimenter  bias.  Recall  from  previous  para¬ 
graphs  that  knowledge  about  the  research  hypotheses  and  the  nature  of 
the  experimental  manipulation  has  the  potential  to  inappropriately  influ¬ 
ence  or  bias  the  outcome  of  a  study.  It  makes  intuitive  sense  that  limiting 
this  knowledge  (if  permitted  by  the  specific  research  design)  might  have  a 
positive  impact  on  the  validity  of  the  conclusions  drawn  from  the  study 
because  it  might  help  to  further  minimize  the  potential  impact  of  experi¬ 
menter  effects. 

There  are  three  main  approaches  or  procedures  for  limiting  the  knowl¬ 
edge  that  researchers  have  regarding  the  nature  of  the  hypotheses  being 
tested,  of  the  experimental  manipulation,  and  of  which  participants  are 
either receiving or not receiving the experimental manipulation (Christensen,
2004; Graziano & Raulin, 2004). Each of these procedures seeks
to reduce or minimize the researcher’s knowledge about the participants
and about which experimental conditions they are assigned to (Graziano
& Raulin, 2004).

The  first  approach  is  referred  to  as  the  double-blind  technique,  which  is  the 
most  powerful  method  for  controlling  experimenter  expectancy  and  re¬ 
lated  bias.  This  procedure  requires  that  neither  the  participants  nor  the  re¬ 
searchers  know  which  experimental  or  control  condition  the  participants 
are  assigned  to  (Leary,  2004).  This  often  requires  that  the  study  be  super¬ 
vised  by  a  person  who  tracks  assignment  of  participants  without  inform¬ 
ing  the  main  researchers  of  their  status  (Rosenthal,  Persinger,  Vikan- 
Kline,  &  Mulry,  1963).  Without  this  knowledge,  it  will  be  very  difficult  for 
the other researchers to either intentionally or inadvertently introduce
experimenter  bias  into  the  study. 
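
The bookkeeping behind a double-blind procedure can be sketched in a few lines. In the hypothetical example below, a supervisor who has no contact with participants generates the allocation and keeps the only copy of the key; the main researchers and the participants see nothing but neutral codes. The code format ("KIT-001"), the condition names, and the function name `blinded_allocation` are assumptions made for illustration, not a description of any standard system.

```python
import random


def blinded_allocation(participant_ids, conditions, seed=None):
    """Create a blinded allocation.

    Returns two separate mappings:
      researcher_view: participant ID -> neutral code (safe to share with researchers)
      unblinding_key:  neutral code -> condition (held only by the supervisor)
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)                       # random order decides who gets which condition

    # Neutral codes are shuffled too, so a code's number reveals nothing about condition.
    codes = [f"KIT-{n:03d}" for n in range(1, len(ids) + 1)]
    rng.shuffle(codes)

    researcher_view = {}
    unblinding_key = {}
    for i, (pid, code) in enumerate(zip(ids, codes)):
        researcher_view[pid] = code
        unblinding_key[code] = conditions[i % len(conditions)]
    return researcher_view, unblinding_key


# Hypothetical study with six participants and two conditions.
participants = [f"P{n:02d}" for n in range(1, 7)]
researcher_view, unblinding_key = blinded_allocation(
    participants, ["treatment", "control"], seed=7
)

# What the main researchers receive: codes only, no condition labels.
for pid in sorted(researcher_view):
    print(pid, researcher_view[pid])
```

The key would be consulted only after data collection is complete (or in an emergency), which is what keeps the main researchers from introducing expectancy effects along the way.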

For  a  variety  of  reasons,  it  is  often  not  practical  or  appropriate  to  use  a 
double-blind  procedure.  This  leads  us  to  a  discussion  of  the  second  most 
effective  approach  for  controlling  experimenter  bias:  the  blind  technique. 
The  blind  technique  requires  that  the  researcher  be  kept  “blind”  or  naive 
regarding  which  treatment  or  control  conditions  the  participants  are  in 
(Christensen,  1988).  As  with  the  double-blind  technique,  someone  other 
than  the  researcher  assigns  the  participants  to  the  required  control  or 
experimental conditions without revealing the information to the
researcher.

If either the double-blind or blind technique is inappropriate or
impractical, the researcher can resort to a third and final approach to
minimizing experimenter bias. This method is known as the partial-blind
technique, which is similar to the blind technique except that the
researcher is kept naive regarding participant selection for only a portion
of the study. Most commonly, the researcher is kept naive throughout
participant selection and assignment to either control or experimental
conditions (Christensen, 1988).

These three approaches (double-blind, blind, and partial-blind) are
summarized in Rapid Reference 3.5. We will return to the topic of
experimenter bias in Chapter 5.

Approaches for Limiting Researchers’ Knowledge of
Participant Assignment

• Double-blind technique: The most powerful method for controlling
researcher expectancy and related bias, this procedure requires that
neither the participants nor the researchers know which experimental
or control condition research participants are assigned to.

• Blind technique: This procedure requires that only the researcher be
kept “blind” or naive regarding which treatment or control conditions
the participants are in.

• Partial-blind technique: This procedure is similar to the blind
technique, except that the researcher is kept naive regarding participant
selection for only a portion of the study.


Participant  Effects 

As just discussed, experimenter effects are a potential source of bias in any
research study. If the researchers can be a significant source of artifact and
bias, then it makes both intuitive and practical sense that the participants
involved in a research project can also be a significant source of artifact
and bias. Accordingly, we will now discuss a second common form of
artifact and bias that can introduce significant confounds into a research
design if not properly controlled. This source of artifact and bias is most
commonly referred to as “participant effects.”

As the name implies, the participants involved in a research study can be a
significant source of artifact and bias. Just like researchers, they bring
their own unique sets of biases and perceptions into the research setting.
Put simply, the term participant effects refers to a variety of factors
related to the unique motives, attitudes, and behaviors that participants
bring to any research study (Kruglanski, 1975; Orne, 1962). For example, is
the participant anxious about the process, eager to please the researcher, or
motivated by the fact that he or she is being compensated for participation?
Do the participants think they have figured out the purpose of the study, and
are they acting accordingly? In other words, are the participants, either
consciously or unconsciously, altering their behavior in response to the
demands of the research setting? (See Rapid Reference 3.6.)

Participant Effects

Participant effects are a source of artifact and bias stemming from a
variety of factors related to the unique motives, attitudes, and behaviors
that participants bring to any research study.

In this regard, participant effects are very similar to experimenter
effects because they are simply the expression of individual differences,
predispositions, and biases imposed upon the context of a research design.
Often,  participants  are  unaware  of  their  own  attitudes,  predispositions, 
and  biases  in  their  day-to-day  lives,  let  alone  in  the  carefully  controlled 
context  of  a  research  study. 

The impact of participant effects has been thoroughly researched and
well documented. At the broadest level of conceptualization, research
suggests that participants’ motivation and behavior change simply as a
result of their being involved in a research study. This phenomenon is most
commonly referred to as the Hawthorne effect. The term “Hawthorne effect”
was coined as a result of a series of studies that lent support to the
proposition that participants often change their behavior merely in response
to being observed and out of a desire to be helpful to the researcher. There
are numerous, more specific ways that participant effects could manifest
themselves in the context of a research design. Many of these manifestations
are directly related to the different roles that a participant might assume
within the context of the research study.

Participant Effects by Any Other Name . . .

Participant effects are also referred to as “demand characteristics.”
Demand characteristics are the tendencies of research participants to act
differently than they normally might simply because they are taking part in
a study. At their most severe, demand characteristics are changes in
behavior that are based on assumptions about the underlying purpose of the
study, which can introduce a significant confound into the study’s findings.

Consider  for  a  moment  that  most  participants  in  research  studies  are 
volunteers  (Rosen,  1970;  Rosnow,  Rosenthal,  McConochie,  &  Arms, 
1969). As such, these individuals might be different from other people
who decide not to participate or do not have the opportunity to participate
in the study. This is further confounded by the fact that a significant
amount  of  research  is  conducted  on  college  undergraduates  enrolled  in 
introductory-level  psychology  courses.  Often,  participation  in  research  is 
tied  to  course  credit  or  some  other  form  of  external  motivation  or  reward. 
Accordingly,  volunteer  participants  might  be  different  from  the  general 
population  as  a  whole,  and  the  conclusions  drawn  from  the  study  might  be 
limited  to  this  specific  population.  Therefore,  even  volunteer  status  may 
result  in  a  participant  effect  because  volunteers  are  a  unique  subset  of  the 
population  with  distinct  characteristics  that  can  have  a  significant  impact 
on  the  results  of  the  study. 

Some commentators have taken the concept of participant effects to an
even more refined level by identifying the different “roles” that a
participant might consciously or unconsciously adopt in the context of a
research study (Rosnow, 1970; Sigall, Aronson, & Van Hoose, 1970; Spinner,
Adair, & Barnes, 1977). Although there is some disagreement about the
existence and exact classification of participant roles, the most commonly
discussed roles include the “good,” the “negativistic,” the “faithful,” and
the “apprehensive” participant roles (Kazdin, 2003c; Weber & Cook, 1972).

The “good” participant might attempt to provide information and
responses that might be helpful to the study, while the “negativistic”
participant might try to provide information that might confound or
undermine it. The “faithful” participant might try to act without bias, while
the “apprehensive” participant might try to distort his or her responses in a
way that portrays him or her in an overly positive or favorable light (Kazdin,
2003c). Regardless of the role or origin, participant effects, either alone or
in combination, can have a direct impact on the attitudes of research
participants, which in turn can have an impact on the overall validity of the
study. Specifically, participant effects can undermine both the internal and
external validity of a study. Internal and external validity are discussed in
detail in Chapter 6.

Controlling  Participant  Effects 

As with experimenter effects, researchers should consider and attempt to
control for the impact of participant effects. And, as with other sources of
bias, the potential impact of these effects should be considered early on
during the design phase of the study. Conveniently, one of the methods for
controlling participant effects is exactly the same as one for controlling
experimenter effects, namely, the use of the double-blind technique.
Remember that this procedure requires that neither the participants nor the
researchers know which experimental or control conditions the participants
are assigned to. Without this knowledge, it would be difficult for
participants to alter their behavior in ways that would be related to the
experimental conditions to which they were assigned. This approach, however,
would still not prevent a participant from adopting one of the preconceived
participant roles we discussed previously.

Deception is another relatively common method for controlling
participant effects. The use of deception should not be taken lightly because
there are potential ethical issues that should be considered before
proceeding. At a minimum, deception cannot jeopardize the well-being of the
study participants, and at the conclusion of the study, researchers are
usually required to explain to the participants why deception was used. When
researchers use deception, it usually takes the form of providing
participants with misinformation about the true hypotheses of interest or the
focus of the study (see Christensen, 2004). Without knowledge of the true
hypotheses, it is much more difficult for participants to alter their
behaviors in ways that either support or refute the research hypotheses.

Double-blind and deception techniques are common ways of controlling
for participant effects, and these approaches operate by altering the
knowledge available to the participants. One drawback to these approaches
is that the researchers will never know for certain whether their attempts at
control were successful or what the participants were actually thinking as
they progressed through the various aspects of the research study.
Fortunately, there is one more approach for controlling for participant
effects that allows the researchers to gather information about participant
attitudes and behavior as they progress through the research study.

Use Deception Cautiously and Only Under
Appropriate Circumstances!

The use of deception in research design is controversial and should not
be undertaken without serious consideration of the possible implications
and consequences. Certain ethical codes and federal rules and regulations
are very clear that the potential gains of using deception in research must
be balanced against potential negative consequences and effects on the
participants. Generally, the use of deception must be justified in the
context of the research study’s possible scientific, educational, or applied
value. In addition, the researchers must consider other approaches and
demonstrate that the research question necessarily involves the use of
deception. Researchers must never use deception when providing
information about the possible risks and benefits of participating in the
study or in obtaining the informed consent of the research participants.

This third approach is straightforward and focuses on a process of
inquiry. The researchers can simply ask the participants about any number of
issues related to participant effects and the overall purpose and
hypotheses of the study. Typically, the researchers will ask questions
related to the hypotheses and the nature of the roles adopted by the
participants. The timing of the questioning can vary. For example,
participants might be asked about specific or essential aspects of the study
in a retrospective fashion, after they have completed the study. On the other
hand, the researchers might decide to question participants concurrently,
throughout the course of the study. The choice of approach is up to the
researchers. Regardless of timing, the intent of this approach is to allow
the researchers to gather information directly from the participants
regarding role, motivation, and behavior (Christensen, 2004). This
information can then be controlled for in the statistical analysis or used to
remove a certain participant’s data from the analysis.

ACHIEVING  CONTROL  THROUGH  RANDOMIZATION: 
RANDOM  SELECTION  AND  RANDOM  ASSIGNMENT 

Our discussion so far has focused on approaches for controlling two
common sources of potential artifact and bias, namely, experimenter and
participant effects. Although important, these are only two of the many
potential sources of artifact and bias that should be controlled for in a
research study. Other types of artifact and bias can come from a variety of
sources and are unique to the research design in question. We discuss these
other types of artifacts and biases in detail in Chapter 6.

Controlling and minimizing these sources of artifact and bias is directly
related to the quality of any study, and it bolsters the confidence we can
have in the accuracy and relevance of the results. In an ideal world,
researchers would be able to eliminate all extraneous influences from the
contexts of their studies. That is the ultimate goal, but one that no research
study will likely ever attain. As you can imagine, eliminating all sources of
artifact and bias is virtually impossible. Fortunately, there are other
methods that can be used to help researchers control for the influence of
extraneous variables that do not require the a priori identification and
elimination of all potential sources of artifact and bias. The most powerful
and effective method for minimizing the impact of extraneous variables and
ensuring the internal and external validity of a research study is
randomization.

Randomization is a control method that helps to ensure that extraneous
sources of artifact and bias will not confound the validity of the results of
the study. In other words, randomization helps ensure the internal validity
of the study by helping to eliminate alternative rival hypotheses that might
explain the results of the study. (We will discuss internal validity in detail
in Chapter 6.) Unlike other forms of experimental control, randomization
does not attempt to eliminate sources of artifact and bias from the study.
Instead, randomization attempts to control for the effects of extraneous
variables by ensuring that they are equivalent across all of the experimental
and control groups in the study. Randomization can be used when selecting
the participants for the study and for assigning those participants to
various conditions within the study. These two approaches are referred to as
“random selection” and “random assignment,” respectively. As you may
recall, the topic of randomization was briefly discussed in Chapter 2 in the
context of choosing study participants and assigning those participants to
groups within the study. In this section, we will discuss randomization as a
strategy for controlling artifact and bias.

We will now discuss how participant selection and assignment constitute
the most effective way of controlling for and minimizing the impact
of  sources  of  artifact  and  bias.  As  mentioned  previously,  it  is  impossible  to 
identify,  let  alone  eliminate,  all  of  the  potential  confounds  that  can  be  at 
work  within  a  research  study.  Despite  this,  researchers  can  still  attempt  to 
minimize  the  effects  of  these  confounds  by  using  random  selection  and 
random  assignment  in  participant  selection  and  assignment  procedures. 

Random selection is a control technique that increases external validity,
and it refers to the process of selecting participants at random from a
defined population of interest (Christensen, 2004; Cochran, 1977). We will
discuss external validity in detail in Chapter 6. The population of interest is
usually defined by the purpose of the research and the research question
itself. For example, if the purpose of a research project is to study
depression in the elderly, then the population of interest will most likely be
elderly people with depression.

Randomization

Randomization is a control method that helps to eliminate alternative
rival hypotheses that might otherwise explain the results of the study.
Randomization does not attempt to eliminate sources of artifact and bias
from the study. Instead, it attempts to control for the effects of extraneous
variables by ensuring that they are equivalent across all of the
experimental and control groups in the study.

The research question might further define the population of interest;
in this example, the research question might be the following: Does a new
therapy technique alleviate symptoms of depression in people over the age
of 65? In the broadest sense, the population of interest is therefore people
with depression who are at least 65 years old. Ideally, we would be able to
draw our sample of participants from the entire population of elderly
individuals suffering from depression, and each of these individuals would
have an equal chance of being selected to participate in the study. The fact
that each member of the population has an equal chance of being selected to
participate is the hallmark of random selection.
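
The "equal chance" requirement can be made concrete with a short sketch. It
assumes, purely for illustration, that a complete list of the population (a
sampling frame) is available, which, as the discussion of feasibility in this
section makes clear, is rarely true in practice. The frame of 2,000 made-up
identifiers and the function name are hypothetical.

```python
import random


def random_selection(sampling_frame, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every member of the frame has the same chance of being selected,
    which is the defining property of random selection.
    """
    rng = random.Random(seed)
    return rng.sample(list(sampling_frame), sample_size)


# Hypothetical sampling frame: identifiers for 2,000 people aged 65 or
# older with depression (invented for this illustration).
frame = [f"person_{i:04d}" for i in range(1, 2001)]

# Select 30 participants; each person's chance of selection is 30 / 2000.
sample = random_selection(frame, 30, seed=42)
print(sample[:5])
```

Fixing the seed only makes the draw reproducible for record keeping; it does
not change the fact that every person in the frame had the same chance of
being chosen.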

Random selection helps control for extraneous influences because it
minimizes the impact of selection biases and increases the external
validity of the study. In other words, using random selection would help
ensure that the sample was representative of the population as a whole. In
this case, a sample composed of randomly selected elderly individuals
with depression should be representative of the population of all elderly
individuals with depression. Theoretically, the results we obtain from a
randomly selected sample should be generalizable to all elderly
individuals with depression. Figure 3.1 provides a graphic representation of
this example.

As you might suspect, random selection in its most general form is
almost impossible to accomplish. Consider the resources and logistical
network that would be necessary to randomly select from an entire
population of interest. Would you want the task of randomly selecting and
recruiting elderly, depressed individuals from across the world? From the
United States? From the state or city in which you live? Although possible,
random selection is a daunting prospect even when we narrow the
population of interest.

For this reason, researchers tend to randomly select from samples of
convenience. A sample of convenience is simply a potential source of
participants that is easily accessible to the researcher. A common example
of a sample of convenience is undergraduate psychology majors, who are
usually subtly or not so subtly coerced to participate in a wide variety of
research activities. We could conduct our study of depression and the elderly
using a readily accessible sample of convenience, rather than attempting to
sample the entire population of depressed elderly individuals.

Figure 3.1 A graphic example of random selection.

In any research study, the population of interest is usually defined by the purpose of the research and
the research question itself. In our current example, the purpose of the research study is to examine
depression in the elderly, and the research question is whether a new therapy technique alleviates
symptoms of depression in people over the age of 65.

For example, we might approach two or three local geriatric facilities
and try to randomly select participants from each. In many instances, the
study might simply focus on randomly selecting participants from one
facility. The advantage of this approach is that we might actually be able to
conduct the research and gain valuable, albeit limited, information on
treating depression in the elderly. The primary disadvantage is that this
approach has a negative impact on external validity. The sample will be
smaller and likely less representative of the population of depressed,
elderly individuals, which can have a negative impact on statistical
conclusion validity.

Sample of Convenience

A sample of convenience is simply a potential source of research
participants that is easily accessible to the researcher.

As will be discussed in Chapter 6, the aspect of quantitative evaluation
that affects the accuracy of the conclusions drawn from the results of a
study is called statistical conclusion validity. At its simplest level,
statistical conclusion validity addresses the question of whether the
statistical conclusions drawn from the results of a study are reasonable.
Although an exhaustive discussion is inappropriate at this point, the results
of certain statistical analyses can be influenced by sample size.
Accordingly, the use of an exceptionally small, or large, sample can produce
misleading results that do not accurately represent the actual relationship
between the independent and dependent variables.
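
A small numerical sketch may help show why sample size matters. It is a
simplified illustration under our own assumptions, not an analysis
prescribed in this chapter: we use a two-sample z-test with a known standard
deviation, and the values (a 1-point difference on a scale with a standard
deviation of 10) are invented. The same small difference between groups is
nowhere near statistically significant with 10 participants per group, yet it
becomes highly significant with 10,000 per group.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a two-sample z-test.

    Assumes equal group sizes and a known, common standard deviation
    (a simplification chosen to keep the arithmetic visible).
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # For a standard normal variable Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# The same observed difference evaluated at three sample sizes.
for n in (10, 100, 10_000):
    print(n, two_sided_p(mean_diff=1.0, sd=10.0, n_per_group=n))
```

The difference between the groups never changes in this sketch; only the
sample size does. A statistically significant result from a very large sample
therefore says little about whether the effect is large enough to matter, and
a nonsignificant result from a very small sample says little about whether an
effect is absent.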

The second type of randomization control technique is random
assignment, which is concerned with how participants are assigned to
experimental and control conditions within the research study. The basic
tenet of random assignment is that all participants have an equal likelihood
of being assigned to any of the experimental or control groups (Sudman,
1976). The basic purpose of random assignment is to obtain equivalence
among groups across all potential confounding variables that might
impact the study. Remember that we can never eliminate all forms of artifact
and bias, and random assignment does not attempt to do this. Instead, it
seeks to distribute or equalize these potential confounds across
experimental and control groups. Let’s consider our study of depression and
the elderly to illustrate the concept of random assignment.

Random Assignment

Random assignment is a control technique in which all participants have an
equal likelihood of being assigned to any of the experimental or control
groups. Random assignment increases internal validity because it
distributes or equalizes potential confounds across experimental and control
groups. Studies that use random assignment are referred to as true
experiments, while studies that do not use random assignment are referred to
as quasi experiments. See Chapter 5 for a more detailed discussion of true
experimental and quasi-experimental research designs.

We manage to randomly select 30 participants from local geriatric
facilities. Remember that we are interested in the effects of our new therapy
on depression. Accordingly, we form two groups: The first group receives
the treatment, while the other receives a psychologically inert form of
intervention that does not involve therapy. We have 30 participants who
must now be randomly assigned to the two conditions. According to the
tenets of random assignment, we must ensure each participant has an
equal probability of winding up in either of the two groups. This is usually
accomplished by using a computer-generated randomization process or
by simply referring to a table of random numbers. (Contrast this with a
nonrandom approach to assignment.)

For example, taking the first 15 participants and assigning them to the
treatment condition and the last 15 to the control condition would not be
random assignment because the participants did not have an equal
opportunity to be placed in either of the two groups. If we proceeded this
way, then we could be introducing a selection bias into the study. The first
15 participants might differ significantly from the second 15 on a variety of
factors. Are the first 15 more motivated to participate because they are
actively seeking symptom reduction? Motivation level itself might be a
confounding variable. The second group of 15 might not be as motivated
to participate for a variety of reasons.

Therefore,  the  results  we  obtained  might  be  affected  by  these  differences 
and  not  be  a  reflection  of  our  intervention  (the  independent  variable),  even 
if  we  found  a  positive  effect.  If  we  randomly  assigned  the  participants  to 
each  of  the  two  groups,  we  would  expect  that  the  two  groups  should  be 
equivalent in terms of participant characteristics and any other
confounding variables, such as motivation. This equivalence is a researcher’s
best defense against the impact of extraneous influences on the validity of a
study.
Accordingly,  random  assignment  should  be  utilized  whenever  possible  in 
the  context  of  research  design  and  methodology.  Figure  3.2  gives  a  graphic 
representation  of  random  assignment  in  our  example. 
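
The assignment step in our example can be sketched in code. This is one
possible computer-generated procedure, written under our own assumptions
(the function name and the participant numbering are hypothetical): the 30
participants are shuffled into a random order and then dealt alternately into
the two groups, so that each participant has the same probability of ending
up in either group and the groups are of equal size.

```python
import random


def random_assignment(participants, groups=("treatment", "control"), seed=None):
    """Randomly assign participants to equally sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # chance alone determines the order

    assignment = {group: [] for group in groups}
    # Deal the shuffled participants out to the groups in turn.
    for position, participant in enumerate(shuffled):
        assignment[groups[position % len(groups)]].append(participant)
    return assignment


# 30 participants, numbered in the order in which they enrolled.
result = random_assignment(range(1, 31), seed=3)
print(sorted(result["treatment"]))
print(sorted(result["control"]))
```

Contrast this with the nonrandom approach described above, in which the
first 15 enrollees go to treatment and the last 15 to control: there, order of
enrollment (and anything related to it, such as motivation) determines group
membership, whereas here it cannot.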

[Figure 3.2: flow diagram. The population of all individuals aged 65 or older
suffering from depression is narrowed by random selection (each individual
has an equal chance of being chosen for the study) to a sample of the
population for use in the research study; in this case, a sample of
convenience from local geriatric facilities. The 30 participants selected are
then divided by random assignment into a treatment group of 15 participants
and a control group of 15 participants.]

Figure 3.2 A graphic example of random assignment.

Using our new sample of convenience, we can build on the example provided in Figure 3.1 to illustrate
the process of random assignment. We manage to randomly select 30 participants from local geriatric
facilities. We must now randomly assign them to either the therapy group or the control group.

Obviously, random selection and random assignment (collectively
referred to as “randomization”) are essential techniques for minimizing the
impact of extraneous variables and ensuring the validity of the conclusions
drawn from the results of a research study. Although optimal, randomization
is not the only approach for minimizing, or controlling for, the impact of
extraneous variables. In our previous discussion, we highlighted the
theoretical and logistical difficulties inherent in trying to achieve true
random selection and random assignment. These realities often make it
difficult, if not impossible, to achieve true randomization. In some
circumstances, randomization might not be the best approach to use because
the researchers might be more interested in or concerned with the impact of
specific extraneous variables and confounds. When this is the situation, some
measure of experimental control can be achieved by holding the influence of
the variable or variables in question constant in the research design.


DON’T FORGET


Techniques  for  holding  variables 
constant,  such  as  matching  and 
blocking,  are  not  intended  to  be 
substitutes  for  true  randomization. 


Holding Variables Constant

The primary and most common method for holding the influence of a
specific variable or variables constant in a study is referred to as matching.
This assignment procedure involves matching research participants on
variables that may be related to the dependent variable and then randomly
assigning each member of the matched pair to either the experimental
condition or control condition (Beins, 2004; Graziano & Raulin, 2004).
The application of matching is best illustrated through example. Let’s
revisit the example we considered earlier regarding a new treatment for
depression in an elderly population.

DON’T FORGET

Matching

This assignment procedure involves matching research participants on
variables that may be related to the dependent variable and then randomly
assigning each member of the matched pair to either the experimental
condition or the control condition.

In  our  previous  discussion,  we  randomly  assigned  participants  to  either 
an  experimental  or  a  control  condition.  We  will  use  the  same  basic  premise 
in  this  example,  in  which  we  are  still  interested  in  knowing  whether  our 
treatment  will  produce  greater  reduction  of  symptoms  of  depression  than 
will  receiving  an  inert  intervention  that  does  not  involve  therapy.  As  we 
previously  discussed,  we  sampled  from  the  population  in  the  same  way, 
and  still  ended  up  using  a  sample  of  convenience;  we  then  randomly  as¬ 
signed  the  participants  to  the  experimental  or  control  group. 

Now  let’s  add  another  layer  of  complexity  to  the  scenario.  We  still  want 
to  know  whether  our  new  treatment  is  effective,  but  we  might  also  be  in¬ 
terested  in  the  potential  impact  of  other  specific,  potentially  confounding 
variables.  Consider,  for  example,  that  therapeutic  outcome  can  sometimes 
be  influenced  by  intelligence.  Difficulties  with  memory  and  other  modes 
of  cognitive  functioning  might  also  significantly  impact  the  outcome  of 
therapy  when  working  with  elderly  clients. 

Given  this,  the  researchers  decide  to  control  for  the  effects  of  memory 
in  the  study.  Accordingly,  the  methodology  is  altered  to  include  a  general 
measure  of  memory  functioning  that  demonstrates  adequate  reliability 
and  validity.  In  practice,  this  assessment  would  have  to  be  given  before 
matching  or  assignment  could  occur. 

The  first  step  in  the  matching  procedure  would  be  to  create  matched 
pairs  of  participants  based  on  their  memory  screening  score.  In  this  case, 
we  have  a  two-group  design — therapy  versus  an  inert  treatment  (control 
group).  The  researchers  would  take  the  two  highest  scores  on  the  mem¬ 
ory  test  and  those  participants  would  constitute  a  matched  pair.  Next, 
this  matched  pair  would  be  split  and  each  participant  randomly  assigned 
such  that  one  member  ends  up  in  the  experimental  group  and  one  mem¬ 
ber  ends  up  in  the  control  group.  In  other  words,  each  participant  in  this 
first  matched  pair  still  has  an  equal  likelihood  of  being  assigned  to  either 
the  treatment  or  the  control  condition.  The  process  is  repeated,  so  the 
next  two  highest  scores  on  the  memory  screen  would  be  matched  and 
then randomly assigned to the two conditions. The process would continue
until each of the participants was assigned to either one of the two
conditions.

Note  that  matching  can  be  used  with  more  than  two  groups.  With 
three  groups,  the  three  highest  scores  would  be  randomly  assigned,  with 
four  groups  the  four  highest  scores,  and  so  on.  Similarly,  participants  can 
be  matched  on  more  than  one  variable.  In  this  case,  for  example,  we 
might  also  be  interested  in  gender  as  a  potentially  confounding  variable. 
The  researchers  could  take  the  two  highest  male  memory  scores  and  ran¬ 
domly  assign  each  participant  such  that  one  is  in  the  experimental  and  the 
other  in  the  control  group,  and  then  repeat  the  procedure  for  females 
based  on  memory  score.  Ultimately,  the  goal  is  the  same:  to  make  the 
experimental  and  control  conditions  equivalent  on  the  variables  of 
interest.  In  our  example,  the  researchers  could  safely  assume  that  the  two 
groups  had  equivalent  representation  in  terms  of  gender  and  memory 
functioning. 
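
The matching procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant labels, the memory scores, and the function name matched_assignment are hypothetical, and participants left over when the sample does not divide evenly into matched sets are simply left unassigned.

```python
import random

def matched_assignment(scores, conditions=("experimental", "control"), seed=None):
    """Assign participants to conditions by matching on a screening score.

    scores: dict mapping participant id -> score on the matching variable
            (here, a hypothetical memory screening score).
    Returns a dict mapping participant id -> condition.
    """
    rng = random.Random(seed)
    k = len(conditions)
    # Rank participants from the highest to the lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Assumption: leftover participants (sample size not divisible by k)
    # are not assigned.
    usable = len(ranked) - len(ranked) % k
    assignment = {}
    for start in range(0, usable, k):
        matched_set = ranked[start:start + k]  # a matched pair when k == 2
        shuffled = list(conditions)
        rng.shuffle(shuffled)  # random order of conditions within the set
        for participant, condition in zip(matched_set, shuffled):
            assignment[participant] = condition
    return assignment

# Hypothetical memory screening scores for eight participants.
memory_scores = {"P1": 92, "P2": 88, "P3": 85, "P4": 81,
                 "P5": 77, "P6": 70, "P7": 66, "P8": 59}
groups = matched_assignment(memory_scores, seed=1)
```

Passing three or four condition labels forms matched sets of three or four scores, as described above. To match on a second variable such as gender, one option is to call the function separately on the male and female participants.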

Although  matching  is  one  of  the  more  common  approaches  for  hold¬ 
ing  the  influence  of  extraneous  variables  constant,  there  are  other  ap¬ 
proaches  that  can  be  used.  The  first  of  these  approaches  is  referred  to  as 
“blocking.” Unlike matching, which is concerned with holding extraneous
variables constant, blocking is an approach that allows the researchers

to  determine  what  specific  im¬ 
pact  the  variable  in  question  is 
having  on  the  dependent  variable 
(Christensen,  1988).  In  essence, 
blocking  takes  a  potentially  con¬ 
founding  variable  and  examines 
it  as  another  independent  vari¬ 
able. 

An example should help clarify how blocking is actually implemented in
the context of a research study. Let’s return once again to our treatment
effectiveness study for depression in the elderly. In the original design,
we were interested in whether the new treatment was effective for reducing
symptoms of depression in the elderly. There were two groups: one group
received the new treatment and the other group received an inert or control
intervention.


Blocking

This assignment technique allows the researchers to determine what
specific impact the variable in question is having on the dependent
variable by taking a potentially confounding variable and examining it as
another independent variable.

In  this  example,  the  independent  variable  is  the  new  treatment  and  the 
dependent  variable  is  the  symptom  level  of  depression.  Blocking  allows 
for  a  potentially  confounding  variable  to  become  an  independent  variable. 
We  will  use  memory  as  our  potentially  confounding  or  blocking  variable. 
In  other  words,  we  not  only  want  to  know  whether  the  treatment  is  effec¬ 
tive,  we  also  want  to  know  whether  memory  functioning  has  an  impact  on 
therapeutic  effectiveness.  Therefore,  the  researchers  might  first  divide  the 
participants  into  two  categories  based  on  memory  score.  For  instance, 
scores  below  a  certain  cutoff  number  would  constitute  the  “impaired 
memory”  group  and  scores  above  the  cutoff  number  would  constitute  the 
“adequate memory” group. The participants within each memory category
would then be randomly assigned to either the experimental group or the
control group. Note that
now  there  are  two  independent  variables,  therapy  and  memory,  and  four 
groups  instead  of  two  groups  in  our  study.  In  the  original  design,  there 
were  only  two  groups,  experimental  and  control.  Now  the  researchers 
have  four  groups:  therapy/impaired  memory,  therapy/adequate  memory, 
no therapy/impaired memory, and no therapy/adequate memory. As you
can  see,  the  researchers  can  now  compare  the  performance  of  these 
groups  to  determine  whether  memory  had  an  effect  on  therapeutic  effec¬ 
tiveness.  Without  the  use  of  blocking,  these  additional  comparisons  would 
not  have  been  possible. 
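
As a rough sketch of the blocking design just described, the following Python code sorts participants into memory blocks and then randomizes within each block. The scores, the cutoff of 75, and the function name blocked_assignment are hypothetical. The text does not say how to treat a score exactly at the cutoff, so the code assumes it counts as adequate memory.

```python
import random
from collections import Counter

def blocked_assignment(scores, cutoff, seed=None):
    """Return a dict mapping participant id -> (memory block, condition)."""
    rng = random.Random(seed)
    blocks = {"impaired memory": [], "adequate memory": []}
    for participant, score in scores.items():
        # Assumption: a score equal to the cutoff counts as adequate memory.
        block = "adequate memory" if score >= cutoff else "impaired memory"
        blocks[block].append(participant)
    cells = {}
    for block, members in blocks.items():
        rng.shuffle(members)  # random assignment happens within each block
        half = len(members) // 2
        for i, participant in enumerate(members):
            condition = "therapy" if i < half else "no therapy"
            cells[participant] = (block, condition)
    return cells

# Hypothetical memory screening scores; four fall on each side of the cutoff.
memory_scores = {"P1": 92, "P2": 88, "P3": 85, "P4": 81,
                 "P5": 70, "P6": 66, "P7": 59, "P8": 52}
cells = blocked_assignment(memory_scores, cutoff=75, seed=3)
cell_counts = Counter(cells.values())  # sizes of the four groups
```

Because randomization occurs separately inside each block, memory level cannot differ between the therapy and no-therapy conditions, and the four resulting cells can be compared directly.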

Another  selection  approach  for  controlling  extraneous  variables  re¬ 
quires  the  researchers  to  hold  the  extraneous  variable  in  question  constant 
by  selecting  a  sample  that  is  very  uniform  or  homogeneous  on  the  variable 
of  interest.  For  example,  the  researchers  might  first  select  only  those  el¬ 
derly  individuals  with  intact  memory  functioning  for  the  therapy  study, 
most  likely  based  on  a  pretest  cutoff  score.  All  participants  who  did  not 
meet the cutoff score would be excluded from the study. The participants
would then be randomly assigned to the different experimental conditions.
The rationale behind this approach is relatively straightforward.
Specifically,  if  all  of  the  participants  are  roughly  equivalent  on  the  variable 
under  consideration  (e.g.,  memory),  then  the  potential  impact  of  the  vari¬ 
able  is  consistent  across  all  of  the  groups  and  cannot  operate  as  a  con¬ 
found.  Although  this  is  an  effective  way  of  eliminating  potential  con¬ 
founds,  it  has  a  negative  effect  on  the  generalizability  of  the  results  of  a 
study.  In  this  example,  any  results  would  pertain  only  to  elderly  individu¬ 
als  with  adequate  memory  functioning  and  not  to  a  broader  representation 
of  elderly  people  suffering  from  depression. 

Statistical  Approaches 

The  final  method  for  attaining  control  of  extraneous  variables  that  we  will 
discuss  involves  statistical  analyses  rather  than  the  selection  and  assign¬ 
ment  of  participants.  Rapid  Reference  3.7  lists  the  methods  we’ll  describe 
in  some  detail  here. 

One  statistical  approach  for  determining  equivalence  between  groups 
is  to  use  simple  analyses  of  means  and  standard  deviations  for  the  variables 
of  interest  for  each  group  in  the  study.  A  mean  is  simply  an  average  score, 
and  a  standard  deviation  is  a  measure  of  variability  indicating  the  average 

amount  that  scores  vary  from  the 
mean.  (These  concepts  will  be  dis¬ 
cussed  in  more  detail  in  Chapter 
7.)  We  could  use  means  and  stan¬ 
dard  deviations  to  obtain  a  snap¬ 
shot  of  group  scores  on  a  variable 
of  interest,  such  as  memory. 

Rapid Reference 3.7

Statistical Approaches for Holding Extraneous Variables Constant

•  Descriptive statistics

•  t-test

•  ANOVA

•  ANCOVA

•  Partial correlation

Let’s assume we randomly assign our elderly participants to our two
original groups and that we are still interested in memory functioning as
a potential confounding variable. Theoretically, random assignment should
make the two groups equivalent in terms of
memory  functioning.  If  we  were  cynical  (or  perhaps  obsessive- 
compulsive),  we  could  check  the  means  and  standard  deviations  for  mem¬ 
ory  scores  for  both  groups  to  see  if  they  were  consistent.  For  some  re¬ 
searchers,  eyeballing  the  results  would  be  sufficient — in  other  words,  if 
the  means  and  standard  deviations  were  close  for  both  groups,  we  would 
assume that there was no confound. For others, a statistical test (t-test for
two groups, or analysis of variance [ANOVA] for three or more groups) to
compare  the  means  would  be  run  to  determine  whether  there  was  a  statis¬ 
tically  significant  difference  between  the  groups  on  the  variable  of  interest 
(Howell,  1992).  If  significant  differences  were  found,  then  the  groups 
would  not  be  equivalent  on  the  variable  of  interest,  suggesting  a  possible 
confound.  This  approach  can  be  particularly  useful  when  random  assign¬ 
ment  is  not  possible  or  practical. 
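
The equivalence check described above might look like the following in Python. This is a minimal sketch, not a full analysis: the memory scores are invented, the function names are our own, and the code computes only the pooled-variance t statistic for two independent groups. The critical value of 2.228 (two-tailed, alpha = .05, 10 degrees of freedom) is taken from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev, variance

def describe(scores):
    """Return the mean and standard deviation of a list of scores."""
    return mean(scores), stdev(scores)

def pooled_t(group_a, group_b):
    """Independent-samples t statistic (pooled variance) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical memory screening scores after random assignment.
experimental = [24, 27, 22, 25, 26, 23]
control = [23, 26, 21, 25, 24, 22]

exp_mean, exp_sd = describe(experimental)  # "eyeballing" step
ctl_mean, ctl_sd = describe(control)
t, df = pooled_t(experimental, control)

CRITICAL_T = 2.228  # two-tailed, alpha = .05, df = 10 (from a t table)
groups_differ = abs(t) > CRITICAL_T
```

With these invented scores the t statistic falls well short of the critical value, so the groups would be treated as equivalent on memory. In practice a statistics package would also report the exact probability value.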

There  are  two  other  statistical  approaches  that  can  be  used  to  minimize 
the  impact  of  or  to  control  for  the  influence  of  extraneous  variables.  The 
first  is  referred  to  as  “analysis  of  covariance,”  or  ANCOVA,  and  it  is  used 
during  the  data  analysis  phase  (Huitema,  1980).  This  statistical  technique 
adjusts  scores  so  that  participant  scores  are  equalized  on  the  measured 
variable  of  interest.  In  other  words,  this  statistical  technique  controls  for 
individual  differences  and  adjusts  for  those  differences  among  nonequiv¬ 
alent  groups  (see  Pedhazur  &  Schmelkin,  1991;  Winer,  1971). 
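
To make the idea of “adjusting” scores more concrete, the sketch below computes covariate-adjusted group means, which is one piece of an ANCOVA rather than the complete significance test. It uses the standard adjustment in which each group’s outcome mean is shifted along the pooled within-group regression slope to the grand mean of the covariate. The data, group labels, and function name are hypothetical.

```python
from statistics import mean

def adjusted_means(groups):
    """Covariate-adjusted outcome means for each group.

    groups: dict mapping group name -> list of (covariate, outcome) pairs,
            e.g., (memory score, depression score after treatment).
    """
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    sxy = sxx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        group_means[name] = (mx, my)
        # Accumulate within-group cross-products and sums of squares.
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression slope
    # Shift each group's mean to what it would be at the grand covariate mean.
    return {name: my - slope * (mx - grand_x)
            for name, (mx, my) in group_means.items()}

# Hypothetical (memory, depression) pairs; the therapy group happens to
# have better memory scores than the control group.
data = {
    "therapy": [(10, 20), (12, 19), (14, 18)],
    "control": [(6, 27), (8, 26), (10, 25)],
}
adjusted = adjusted_means(data)
```

In these invented data the raw group means differ by 7 points (19 versus 26), but after adjusting for memory the difference shrinks to 5 points (20 versus 25). The remainder of the raw difference reflected the groups’ nonequivalence on memory rather than the treatment.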

A  partial  correlation  is  another  statistical  technique  that  can  be  used 
to  control  for  extraneous  variables.  In  essence,  a  partial  correlation  is  a 
correlation  between  two  variables  after  one  or  more  variables  have 
been  mathematically  controlled  for  and  partialed  out  (Pedhazur  & 
Schmelkin,  1991).  For  example,  a  partial  correlation  would  allow  us  to 
look  at  the  relationship  between  memory  and  symptom  level  while 
mathematically  eliminating  the  impact  of  another  possibly  confounding 
variable  such  as  intelligence  or  level  of  motivation.  This  assumes,  of 
course,  that  appropriate  data  on  each  variable  have  been  collected  and 
can  be  included  in  the  analyses.  These  statistical  approaches  can  be  used 
regardless  of  whether  random  selection  and  assignment  were  employed 
in  the  study. 
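
A first-order partial correlation can be computed directly from the three pairwise correlations, as the following Python sketch shows. The scores are artificial deviation scores constructed for illustration, and the function names are our own.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y with z partialed out of both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Artificial deviation scores for eight participants (hypothetical data).
intelligence = [1, 1, 1, 1, -1, -1, -1, -1]
memory = [2, 2, 0, 0, 0, 0, -2, -2]
symptoms = [-2, 0, -2, 0, 0, 2, 0, 2]

raw = pearson_r(memory, symptoms)
controlled = partial_r(memory, symptoms, intelligence)
```

Here the raw correlation between memory and symptom level is -.50, but it drops to essentially zero once intelligence is partialed out. In these constructed data, then, the apparent relationship was carried entirely by intelligence.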
SUMMARY

This  chapter  discussed  general  strategies  and  controls  that  can  be  used  to 
reduce  the  impact  of  artifact  and  bias  in  any  given  research  design.  These 
basic  strategies  are  particularly  useful  because  they  help  reduce  the  impact 
of  unwanted  bias  even  when  the  researcher  is  not  aware  that  bias  is  pre¬ 
sent.  The  implementation  of  these  basic  strategies  ultimately  reduces 
threats  to  validity  and  bolsters  the  confidence  that  we  can  place  in  a  study’s 
findings. 


TEST YOURSELF


1. Theoretically, a sample is most representative of the total population
when random __________ is used.

2. Deception can be used in any aspect of the study as long as the benefits of
the study outweigh the potential risks. True or False?

3. The most effective way to equalize the impact of potentially confounding
variables and ensure the internal validity of the study is through
__________.

4. Research participants can assume various roles that can influence the
results of a study. True or False?

5. Research studies that are quasi-experimental are preferred over true
experiments because they utilize random assignment. True or False?

Answers: 1. selection; 2. False (There are ethical prohibitions against using deception under
certain circumstances.); 3. random assignment; 4. True; 5. False (True experiments utilize
random assignment.)

DATA COLLECTION, ASSESSMENT
METHODS,  AND  MEASUREMENT 
STRATEGIES 


The  importance  of  measurement  in  research  design  cannot  be  over¬ 
stated.  Even  the  most  well-designed  studies  will  prove  useless  if 
inappropriate  measurement  strategies  are  used  in  the  data  collec¬ 
tion  stages.  This  chapter  will  discuss  issues  related  to  data  collection  and 
measurement  strategies  in  research  design.  To  be  clear,  this  chapter  is  not 
meant  to  be  an  exhaustive  treatment  of  the  topic.  Indeed,  this  area  of  re¬ 
search  design  could  be,  and  has  been,  the  topic  of  a  number  of  in-depth 
texts  devoted  solely  to  the  subject.  Rather,  this  chapter  is  meant  to  high¬ 
light  important  concepts  related  to  measurement  and  data  collection.  We 
start  with  general  issues  related  to  the  importance  of  measurement  in  re¬ 
search  design.  Next,  we  consider  specific  scales  of  measurement  and  how 
they  are  related  to  various  statistical  approaches  and  techniques.  Finally, 
we  turn  to  psychometric  considerations  and  specific  measurement  strate¬ 
gies  for  collecting  data. 


MEASUREMENT 

Measurement  is  often  viewed  as  being  the  basis  of  all  scientific  inquiry,  and 
measurement  techniques  and  strategies  are  therefore  an  essential  compo¬ 
nent  of  research  methodology.  A  critical  juncture  between  scientific  the¬ 
ory  and  application,  measurement  can  be  defined  as  a  process  through  which 
researchers  describe,  explain,  and  predict  the  phenomena  and  constructs 
of  our  daily  existence  (Kaplan,  1964;  Pedhazur  &  Schmelkin,  1991).  For 
example, we measure how long we have lived in years, our financial success
in dollars, and the distance between two points in miles. Important
life  decisions  are  based  on  performance  on  standardized  tests  that  mea¬ 
sure  intelligence,  aptitude,  achievement,  or  individual  adjustment.  We 
predict  that  certain  things  will  happen  as  we  age,  become  more  educated, 
or  make  other  significant  lifestyle  changes.  In  short,  measurement  is  as  im¬ 
portant  in  our  daily  existence  as  it  is  in  the  context  of  research  design. 

The  concept  of  measurement  is  important  in  research  studies  in  two 
key  areas.  First,  measurement  enables  researchers  to  quantify  abstract 
constructs  and  variables.  As  you  may  recall  from  Chapter  2,  research  is 
usually  conducted  to  explore  the  relationship  between  independent  and 
dependent  variables.  Variables  in  a  research  study  typically  must  be  oper¬ 
ationalized  and  quantified  before  they  can  be  properly  studied  (Kerlinger, 
1992).  As  was  discussed  in  Chapter  2,  an  operational  definition  takes  a  vari¬ 
able  from  the  theoretical  or  abstract  to  the  concrete  by  defining  the  vari¬ 
able  in  the  specific  terms  of  the  actual  procedures  used  by  the  researcher 
to  measure  or  manipulate  the  variable.  For  example,  in  a  study  of  weight 
loss,  a  researcher  might  operationalize  the  variable  “weight  loss”  as  a  de¬ 
crease  in  weight  below  the  individual’s  starting  weight  on  a  particular  date. 

The  process  of  quantifying  the 
variable  would  be  relatively  simple 
in  this  situation — for  example, 
the  amount  of  weight  lost  in 
pounds  and  ounces  during  the 
course  of  the  research  study. 
Without  measurement,  re¬ 
searchers  would  be  able  to  do  little 
else  but  make  unsystematic  obser¬ 
vations  of  the  world  around  us. 

Second,  the  level  of  statistical 
sophistication  used  to  analyze 
data  derived  from  a  study  is  di¬ 
rectly  dependent  on  the  scale  of 
measurement  used  to  quantify  the 
variables of interest (Anderson, 1961). There are two basic categories of
data: nonmetric and metric. Nonmetric data (also referred to as qualitative
data) are typically attributes, characteristics, or categories that describe
an individual and cannot be quantified. Metric data (also referred to as
quantitative data) exist in differing amounts or degrees, and they reflect
relative quantity or distance. Metric data allow researchers to examine
amounts and magnitudes, while nonmetric data are used predominantly as a
method of describing and categorizing (Hair, Anderson, Tatham, & Black,
1995).


DON’T FORGET


Nonmetric  Data  vs. 
Metric  Data 

Nonmetric  data  (which  cannot  be 
quantified)  are  predominantly 
used  to  describe  and  categorize. 
Metric  data  are  used  to  examine 
amounts  and  magnitudes. 


Scales  of  Measurement 

There are four main scales of measurement subsumed under the broader
categories of nonmetric and metric measurement: nominal scales, ordinal
scales, interval scales, and ratio scales. Nominal and ordinal scales are
nonmetric measurement scales. Nominal scales (see Rapid Reference 4.1) are
the least sophisticated type of measurement and are used only to
qualitatively classify or categorize. They have no absolute zero point and
cannot be ordered in a quantitative sequence, and there is no equal unit of
measurement between categories. In other words, the numbers assigned to the
variables have no mathematical meaning beyond describing the characteristic
or attribute under consideration; they do not imply amounts of an attribute
or characteristic. This makes it impossible to conduct standard mathematical
operations such as addition, subtraction, division, and multiplication.
Common examples of nominal scale data include gender, religious and
political affiliation, place of birth, city of residence, ethnicity, marital
status, eye and hair color, and employment status. Notice that each of these
variables is purely descriptive and cannot be manipulated mathematically.

Rapid Reference 4.1

Distinguishing Characteristics of Nominal Measurement Scales and Data

•  Used only to qualitatively classify or categorize, not to quantify.

•  No absolute zero point.

•  Cannot be ordered in a quantitative sequence.

•  Impossible to use to conduct standard mathematical operations.

•  Examples include gender, religious and political affiliation, and marital
status.

•  Purely descriptive and cannot be manipulated mathematically.

The  second  type  of  nonmetric  measurement  scale  is  known  as  the  or¬ 
dinal  scale.  Unlike  the  nominal  scale,  ordinal  scale  measurement  (see  Rapid 
Reference  4.2)  is  characterized  by  the  ability  to  measure  a  variable  in  terms 
of both identity and magnitude. This makes it a higher level of measurement
than the nominal scale because the ordinal scale allows for the
categorization of a variable and its relative magnitude in relation to other
variables. Variables can be ranked in relation to the amount of the
attribute possessed. In simpler terms, ordinal scales represent an ordering
of variables, with some number representing more than another.

Rapid Reference 4.2

Distinguishing Characteristics of Ordinal Measurement Scales and Data

•  Build on nominal measurement.

•  Categorize a variable and its relative magnitude in relation to other
variables.

•  Represent an ordering of variables with some number representing
more than another.

•  Information about relative position but not the interval between the
ranks or categories.

•  Qualitative in nature.

•  Example would be finishing position of runners in a race.

•  Lack the mathematical properties necessary for sophisticated statistical
analyses.

One  way  to  think  about  ordinal  data  is  by  using  the  concept  of  greater 
than  or  less  than,  which  incidentally  also  highlights  the  main  weakness  of 
ordinal  data.  Notice  that  knowing  whether  something  has  more  or  less 
of  an  attribute  does  not  quantify  how  much  more  or  less  of  the  attribute  or 
characteristic  there  is.  We  therefore  know  nothing  about  the  differences 
between  categories  or  ranks;  instead,  we  have  information  about  relative 
position,  but  not  the  interval  between  the  ranks  or  categories.  Like  nomi¬ 
nal  data,  ordinal  data  are  qualitative  in  nature  and  do  not  possess  the  math¬ 
ematical  properties  necessary  for  sophisticated  statistical  analyses.  A  com¬ 
mon  example  of  an  ordinal  scale  is  the  finishing  positions  of  runners  in  a 
race.  We  know  that  the  first  runner  to  cross  the  line  did  better  than  the 
fourth,  but  we  do  not  know  how  much  better.  We  would  know  how  much 
better  only  if  we  knew  the  time  it  took  each  runner  to  complete  the  race. 
This  requires  a  different  level  or  scale  of  measurement,  which  leads  us  to 
a  discussion  of  the  two  metric  scales  of  measurement. 
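
The race example can be made concrete with a few lines of Python. The runners and their finishing times are hypothetical. The sketch shows that the finishing positions (ordinal data) are always one rank apart, even though the underlying times are separated by very different amounts.

```python
# Hypothetical finishing times in seconds.
finish_times = {"Ana": 61.2, "Ben": 61.4, "Cal": 68.9, "Dee": 75.0}

# Ordinal information: finishing position only.
ordered = sorted(finish_times, key=finish_times.get)
positions = {runner: place for place, runner in enumerate(ordered, start=1)}

# Gap between each runner and the next, expressed in ranks and in seconds.
pairs = list(zip(ordered, ordered[1:]))
rank_gaps = [positions[b] - positions[a] for a, b in pairs]
time_gaps = [round(finish_times[b] - finish_times[a], 1) for a, b in pairs]
```

Every entry in rank_gaps is 1, whereas time_gaps shows that first and second place were separated by a fraction of a second and second and third place by several seconds. The ranks alone cannot recover that information.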

Interval  and  ratio  scales  are  the  two  types  of  metric  measurement  scales, 
and  are  quantitative  in  nature.  Collectively,  they  represent  the  most  so¬ 
phisticated  level  of  measurement  and  lend  themselves  well  to  sophisti¬ 
cated  and  powerful  statistical  techniques.  The  interval  scale  (see  Rapid  Ref¬ 
erence  4.3)  of  measurement  builds  on  ordinal  measurement  by  providing 
information  about  both  order  and  distance  between  values  of  variables. 
The  numbers  on  an  interval  scale  are  scaled  at  equal  distances,  but  there  is 
no  absolute  zero  point.  Instead,  the  zero  point  is  arbitrary.  Because  of  this, 
addition  and  subtraction  are  possible  with  this  level  of  measurement,  but 
the  lack  of  an  absolute  zero  point  makes  division  and  multiplication  im¬ 
possible.  It  is  perhaps  best  to  think  of  the  interval  scale  as  related  to  our 
traditional  number  system,  but  without  a  zero.  On  either  the  Fahrenheit  or 
Celsius  scale,  zero  does  not  represent  a  complete  absence  of  temperature, 
yet the quantitative or measurement difference between 10 and 20 degrees
is the same as the difference between 40 and 50 degrees. There might be a
qualitative difference between the two temperature ranges, but the
quantitative difference is identical: 10 units or degrees.

Rapid Reference 4.3

Distinguishing Characteristics of Interval Measurement Scales and Data

•  Quantitative in nature.

•  Build on ordinal measurement.

•  Provide information about both order and distance between values of
variables.

•  Numbers scaled at equal distances.

•  No absolute zero point; zero point is arbitrary.

•  Addition and subtraction are possible.

•  Examples include temperature measured in Fahrenheit and Celsius.

•  Lack of an absolute zero point makes division and multiplication
impossible.

The second type of metric measurement scale is the ratio scale of
measurement (see Rapid Reference 4.4). The properties of the ratio scale are
identical to those of the interval scale, except that the ratio scale has an
absolute zero point, which means that all mathematical operations are
possible. Numerous examples of ratio scale data exist in our daily lives.
Money is a pertinent example. It is possible to have no (or zero) money,
such as a zero balance in a checking account. This is an example of an
absolute zero point. Unlike with interval scale data, multiplication and
division are now possible. Ten dollars is 10 times as much as 1 dollar, and
20 dollars is twice as much as 10 dollars. If we have 100 dollars and give
away half, we are left with 50 dollars, which is 50 times as much as 1
dollar. Other examples include height, weight, and time. Ratio data are the
highest level of measurement and allow for the use of sophisticated
statistical techniques.

Rapid Reference 4.4

Distinguishing Characteristics of Ratio Measurement Scales and Data

•  Identical to the interval scale, except that they have an absolute zero
point.

•  Unlike with interval scale data, all mathematical operations are
possible.

•  Examples include height, weight, and time.

•  Highest level of measurement.

•  Allow for the use of sophisticated statistical techniques.
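
The practical difference between interval and ratio data can be checked with a short Python sketch. The temperatures and dollar amounts are arbitrary illustrative values. The point is that equal differences stay equal when an interval scale is re-expressed in other units, whereas ratios hold up only when the scale has an absolute zero.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: the two 10-degree Celsius differences remain equal to
# each other after conversion to Fahrenheit.
diff_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high_f = celsius_to_fahrenheit(50) - celsius_to_fahrenheit(40)

# The ratio does not survive the change of units, because zero is arbitrary
# on both temperature scales.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: money has an absolute zero, so "twice as much" holds whether
# we count in dollars or in cents.
ratio_dollars = 20 / 10
ratio_cents = (20 * 100) / (10 * 100)
```

Twenty degrees Celsius looks like “twice” 10 degrees, but the same two temperatures are 68 and 50 degrees Fahrenheit, a ratio of 1.36. This is why ratio statements are not meaningful for interval data. The dollar ratio is 2 in either unit.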


PSYCHOMETRIC  CONSIDERATIONS 

A  Note  on  Measurement  and  Operational  Definitions 

The  assessment  instruments  and  methods  used  in  all  forms  of  research 
should  meet  certain  minimum  psychometric  requirements.  As  we  will  dis¬ 
cuss  later  in  this  chapter,  there  is  a  wide  variety  of  measurement  strategies 
and  techniques  that  are  common  in  research  design.  As  with  considera¬ 
tions  in  research  design,  the  research  question  and  the  constructs  under 
study  usually  drive  the  choice  of  measurement  technique  or  strategy.  More 
specifically,  the  researcher  is  usually  concerned  with  operationalizing  and 
quantifying  the  independent  and  dependent  variables  through  some  type 
of  measurement  strategy.  For  example,  depression  can  be  operationalized 
through  measurement  by  using  the  score  from  a  standardized  instrument. 
Similarly,  a  score  on  a  personality  trait  measure  might  be  used  to  opera¬ 
tionalize  a  particular  personality  trait.  Recall  from  Chapter  2  that  an  oper¬ 
ational  definition  is  simply  the  definition  of  a  variable  in  terms  of  the 
actual  procedures  used  to  measure  or  manipulate  it  (Graziano  &  Raulin, 
2004). Given this definition, it is easy to see that operational definitions are
essential  in  research  because  they  help  to  quantify  abstract  concepts.  Op¬ 
erationalization  can  be  easily  accomplished  through  measurement. 

For  example,  a  researcher  studying  a  new  treatment  for  depression 
would  be  interested  in  operationalizing  what  depression  is  and  how  it  is 
measured, or quantified. Although this might seem self-evident at first,
consider all of the potential ways that depression could be operationalized
and  measured.  Is  it  a  score  on  an  instrument  designed  to  measure  depres¬ 
sion?  Is  it  the  presence  or  absence  of  certain  symptoms  as  determined 
through  a  structured  clinical  interview?  Could  it  be  based  on  behavioral 
observations  of  activity  level?  This  merely  scratches  the  surface  of  the  pos¬ 
sible  operational  definitions  of  a  single  variable.  Let’s  stay  with  the  same 
example  and  consider  how  we  would  measure  improvement  in  level  of  de¬ 
pression.  After  all,  if  we  are  interested  in  a  new  treatment  for  depression, 
we  will  have  to  see  whether  our  participants  improve,  remain  the  same,  or 
deteriorate  after  receiving  the  intervention.  So,  how  should  we  quantify 
improvement?  Depending  on  the  operational  definition,  improvement 
could  be  determined  by  observing  reduced  scores  on  a  depression  assess¬ 
ment,  reduced  symptoms  on  a  diagnostic  interview,  observations  of  in¬ 
creased  activity  level,  or  perhaps  observations  of  two  or  all  of  these  in¬ 
dices. 

Ultimately,  the  choice  lies  with  the  researcher,  the  nature  of  the  research 
question  to  be  answered,  the  availability  of  resources,  and  the  availability 
of  measurement  techniques  and  strategies  for  the  construct  of  interest.  In 
any  event,  the  accuracy  and  quality  of  the  data  collected  from  the  study  are 
directly  dependent  on  the  measurement  procedures  and  related  opera¬ 
tional  definitions  used  to  define  and  measure  the  constructs  of  interest. 
Regardless  of  the  approach  used,  measurement  approaches  and  instru¬ 
ments  should  meet  certain  minimum  psychometric  requirements  that  help 
ensure  the  accuracy  and  relevance  of  the  measurement  strategies  used  in  a 
study.  Reliability  and  validity  are  the  most  common  and  important  psy¬ 
chometric  concepts  related  to  assessment-instrument  selection  and  other 
measurement  strategies. 

Reliability  and  Validity  and  Their  Relationship  to  Measurement 

Rapid Reference 4.5

Measurement of Reliability

Reliability refers to the consistency or dependability of a measurement
technique, and it is concerned with the consistency or stability of the
score obtained from a measure or assessment over time and across settings
or conditions. If the measurement is reliable, then there is less chance
that the obtained score is due to random factors and measurement error.

So, how do we know if a measurement method or instrument is reliable?
In its simplest form, reliability is concerned with the relationship between
independently derived sets of scores, such as the scores on an assessment
instrument on two separate occasions. Accordingly, reliability is usually
expressed as a correlation coefficient, which is a statistical analysis that
tells us something about the relationship between two sets of scores or
variables. Adequate reliability exists when the correlation coefficient is
.80 or higher.

At its most general level, reliability (see Rapid Reference 4.5) refers to
the consistency or dependability of a measurement technique (Andrich, 1981;
Leary, 2004). More specifically, reliability is concerned with the
consistency or stability of the score obtained from a measure or assessment
technique over time and across settings or conditions (Anastasi & Urbina,
1997;  White  &  Saltz,  1957).  If  the  measurement  is  reliable,  then  there  is 
less  chance  that  the  obtained  score  is  due  to  random  factors  and  measure¬ 
ment  error.  Measurement  error  is  uncontrolled  for  variance  that  distorts 
scores  and  observations  so  that  they  no  longer  accurately  represent  the 
construct  in  question.  Scores  obtained  from  most  forms  of  data  collection 
are  subject  to  measurement  error.  Essentially,  this  means  that  any  score 
obtained  consists  of  two  components.  The  first  component  is  the  true  score, 
which  is  the  score  that  would  have  been  obtained  if  the  measurement  strat¬ 
egy  were  perfect  and  error  free.  The  second  component  is  measurement  er¬ 
ror,  which  is  the  portion  of  the  score  that  is  due  to  distortion  and  impreci¬ 
sion  from  a  wide  variety  of  potential  factors,  such  as  a  poorly  designed  test, 
situational  factors,  and  mistakes  in  the  recording  of  data  (Leary,  2004). 
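The two ideas above, that any observed score is a true score plus measurement error and that reliability is usually expressed as a correlation between two sets of scores, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function names, the normally distributed true scores (mean 50, SD 10), and the error sizes are all hypothetical choices, not values taken from the text.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_administrations(n_people, error_sd, seed=0):
    """Simulate two administrations of the same instrument.

    Each observed score is a fixed true score plus fresh random error
    (assumed normal here, purely for illustration).
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n_people)]
    time1 = [t + rng.gauss(0, error_sd) for t in true_scores]
    time2 = [t + rng.gauss(0, error_sd) for t in true_scores]
    return time1, time2


if __name__ == "__main__":
    # Small error vs. large error relative to the spread of true scores.
    for error_sd in (2, 10):
        t1, t2 = simulate_administrations(500, error_sd)
        r = pearson_r(t1, t2)
        verdict = "adequate" if r >= 0.80 else "below the .80 guideline"
        print(f"error SD = {error_sd}: r = {r:.2f} ({verdict})")
```

As the amount of random error grows relative to the spread of true scores, the correlation between the two administrations drops, which is what a low reliability coefficient signals.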

Although all measures contain error, the more reliable the method or
instrument, the less likely it is that these influences will affect the accuracy
of the measurement (see Rapid Reference 4.6). Let’s consider an example.
In psychology, personality is a construct that is thought to be relatively
stable. If we were to assess a person’s personality traits using an objective,
standardized instrument, we would not expect the results to change significantly
if we administered the same instrument a week later. If the results
did vary considerably, we might wonder whether the instrument that
we used was reliable (see Rapid Reference 4.7). Notice that we chose this
example because personality is a relatively stable construct that we would
not expect to change drastically over time. Keep in mind that some constructs
and phenomena, such as emotional states, can vary considerably
with time. We would expect reliability to be high when measuring a stable
construct, but not when measuring a transient one.

Rapid Reference 4.6

Strategies for Increasing Reliability and Minimizing
Measurement Error

There are numerous practical approaches that can be used alone or in
combination to minimize the impact of measurement error. These suggestions
should be considered during the design phase of the study and
should focus on data collection and measurement strategies used to measure
the independent and dependent variables. First, the administration of
the instrument or measurement strategy should be standardized: all
measurement should occur in the most consistent manner possible. In
other words, the administration of measurement strategies should be
consistent across all of the participants taking part in the study. Second,
the researchers should make certain that the participants understand the
instructions and content of the instrument or measurement strategy. If
participants have difficulty understanding the purpose or directions of the
measure, they might not answer in an accurate fashion, which has the potential
to bias the data. Third, every researcher involved in data collection
should be thoroughly trained in the use of the measurement strategy.
There should also be ample opportunity for practice before the study begins
and repeated training over the course of the study to maintain consistency.
Finally, every effort should be made to ensure that data are
recorded, compiled, and analyzed accurately. Data entry should be closely
monitored and audits should be conducted on a regular basis (Leary,
2004).
Rapid Reference 4.7

Assessing Reliability

Reliability can be determined through a variety of methods:

•  Test-retest reliability refers to the stability of test scores over time
and involves repeating the same test on at least one other occasion. For
example, administering the same measure of academic achievement on
two separate occasions 6 months apart is an example of this type of
reliability. The interval of time between administrations should be considered
with this form of reliability because test-retest correlations tend
to decrease as the time interval increases.

•  Split-half reliability refers to the administration of a single test that
is divided into two equal halves. For example, a 60-question aptitude
test that purports to measure one aspect of academic achievement
could be broken down into two separate but equal tests of 30 items
each. Theoretically, the items on both forms measure the same construct.
This approach is much less susceptible to time-interval effects
because all of the items are administered at the same time and then
split into separate item pools afterward.

•  Alternate-form reliability is expressed as the correlation between
different forms of the same measure where the items on each measure
represent the same item content and construct. This approach requires
two different forms of the same instrument, which are then administered
at different times. The two forms must cover identical content
and have a similar difficulty level. The two test scores are then correlated.

•  Interrater reliability is used to determine the agreement between
different judges or raters when they are observing or evaluating the
performance of others. For example, assume you have two evaluators
assessing the acting-out behavior of a child. You operationalize “acting-out
behavior” as the number of times that the child refuses to do his or
her schoolwork in class. The extent to which the evaluators agree on
whether or when the behavior occurs reflects this type of reliability.
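Two of the methods listed above lend themselves to a short computational sketch: split-half reliability and interrater reliability. The code below is a minimal illustration, not a definitive procedure. Several things in it are assumptions that do not come from the text: the odd/even item split, the Spearman-Brown correction (a standard adjustment for the shortened half-length tests), the use of Cohen's kappa as a chance-corrected agreement index, and all function names and data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Split-half reliability from one administration of a test.

    item_scores holds one list of item scores per respondent. Items are
    split into odd- and even-positioned halves (an assumed splitting rule),
    the half totals are correlated, and the Spearman-Brown correction
    estimates the reliability of the full-length test.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    corrected = (2 * r_half) / (1 + r_half)
    return r_half, corrected


def percent_agreement(rater_a, rater_b):
    """Proportion of observations on which two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement.

    Undefined (division by zero) if chance agreement is exactly 1, for
    example when both raters use a single identical code throughout.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical data: 1 = child refused schoolwork during the interval.
    evaluator_1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
    evaluator_2 = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
    print("percent agreement:", percent_agreement(evaluator_1, evaluator_2))
    print("Cohen's kappa:", round(cohens_kappa(evaluator_1, evaluator_2), 2))
```

Percent agreement is the more direct reading of "the extent to which the evaluators agree"; kappa is included because simple agreement can look high by chance alone when one code dominates.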
Although reliability is a necessary and essential consideration when
selecting an instrument or measurement approach, it is not sufficient in
and of itself. Validity is another critical aspect of measurement that must
be considered as part of an overall measurement strategy. Whereas reliability
refers to the consistency of the measure, validity focuses on what the test or
measurement strategy measures and how well it does so (Anastasi & Urbina,
1997). Therefore, the conceptual question that validity seeks to answer is
the following: “Does the instrument or measurement approach measure what
it is supposed to measure?” If so, then the instrument or measurement
approach is said to be valid because it accurately assesses and represents
the construct of interest.

Validity and reliability are interconnected concepts (Sullivan & Feldman,
1979). This can be demonstrated by the fact that a measurement cannot
be valid unless it is reliable. Remember that validity is concerned not
only with what is being measured, but also how well it is being measured.
Think of it this way: If you have a test that is not reliable, how can it accurately
measure the construct of interest? Reliability, or consistency, is
therefore a hallmark of validity. Note, however, that a measurement strategy
can be reliable without being valid. The measurement strategy might
provide consistent scores over time, but that does not necessarily mean it
is accurately measuring the construct of interest.

Consider an example in which you choose to use in your study an instrument
that purports to measure depression. It produces reliable scores
as evidenced by a high test-retest reliability coefficient. In other words,
there is a high positive correlation between the pretest and posttest scores
on the same measure. On further inspection, however, you notice that the
content of the instrument is more closely related to anxiety. You are measuring
something reliably, but at this point it might not be depression. In
other words, the instrument, though reliable, might not be a valid measure
of depression; instead, it might be a valid measure of anxiety.

Validity

The concept of validity refers to what the test or measurement
strategy measures and how well it does so. Conceptually, validity
seeks to answer the following question: “Does the instrument or
measurement approach measure what it is supposed to measure?”

As we discussed earlier in this chapter, the accurate measurement of the
constructs and variables in a study is a critical component of research. The
most well-designed study is meaningless and a waste of time and resources
if the independent and dependent variables cannot be identified, conceptualized,
operationalized, and quantified. The validity of measurement approaches
is therefore a critical aspect of the overall design process. How,
then, is the validity of a measurement strategy established? Like reliability,
validity is determined by considering the relationship, either quantitatively
or qualitatively, between the test or measurement strategy and some external,
independent event (Groth-Marnat, 2003). The most common
methods for demonstrating validity are referred to as content-related,
criterion-related, and construct-related validity (Campbell, 1960).

Content-related validity refers to the relevance of the instrument or measurement
strategy to the construct being measured (Fitzpatrick, 1983).
Put simply, the measurement approach must be related to the construct
being measured. Although this concept is usually applied to the development
and critique of psychological and other forms of tests, it is also applicable
to most forms of measurement strategies used in research.

The approach for determining content validity starts with the operationalization
of the construct of interest. The test developer defines the
construct and then attempts to develop item content that will accurately
capture it. For example, an instrument designed to measure anxiety should
contain item content that reflects the construct of anxiety. If the
content does not accurately reflect the construct, then chances
are that there is little or no content validity.

Content Validity

Content-related validity refers to the relevance of the instrument or
measurement strategy to the construct being measured.

Content validity can also be related to other types of measurement
strategies used in research design and methodology. A significant amount
of research, especially in
psychology, is conducted using preexisting, commercially available instruments
(see Rapid Reference 4.8). However, a researcher might be interested
in studying a variable that cannot be measured with an existing instrument
or test, or perhaps the use of commercially available instruments might
be cost prohibitive. This is a relatively common situation that should not
bring the study to a grinding halt. Most forms of research do not require
the use of preexisting or expensive measurement strategies. It is not unusual
for researchers to develop their own measures or measurement
strategies. This is a legitimate approach to data collection as long as the
measure or strategy accurately captures the construct of interest.

Rapid Reference 4.8

Commercially Available Instruments and
Measurement Strategies

A huge number of measurement instruments are commercially available
to researchers. They are particularly abundant in the areas of psychological
and educational research. Researchers must be careful to consider a
number of factors when deciding on whether an existing test is appropriate
for data collection in a research study. A consideration of the psychometric
properties (validity and reliability) is always an essential first step.
Interested readers are referred to the latest editions of the Mental Measurements
Yearbook and Tests in Print, which provide psychometric data
and reviews for a wide variety of measurement materials (Impara & Plake,
1998; Murphy, Impara, & Plake, 1999). What follows is a nonexhaustive list
of other factors that should be considered when evaluating a test:

•  Reliability
•  Validity
•  Cost
•  Time needed to administer
•  Reading level
•  Test length
•  Theoretical soundness
•  Norms
•  Standardized administration procedure
•  Well-documented manual

Consider the following example. A researcher is interested in studying
aggression in young children. The researcher consults the literature only
to find that there is no preexisting measure for quantifying aggression for
the age group under consideration. Rather than abandoning the project,
the researcher decides to create a measure to capture the behavior of
interest. First, “aggression” must be operationalized. In this case, our researcher
is interested in studying physical aggression, so the researcher decides
to operationalize aggression as the number of times a child strikes
another child during a certain period of time. A checklist of items related
to this type of aggression is then developed. The researcher observes children
in a variety of settings and records the frequency of aggressive behavior
and the circumstances surrounding each event. Although there are
no psychometric data available for this approach, it is apparent that the
measurement strategy has content validity. The items and the approach
clearly measure the construct of aggression in young children as operationalized
by the researcher.

Another effective approach to determining the validity of an instrument
or measurement strategy is to examine its criterion validity. Criterion
validity is determined by the relationship between the measure and
performance on an outside criterion or measure. The outside criterion or
measure should be related to the construct of interest, and it can be
measured at the same time the measure is given or sometime
in the future. If the measure is compared to an outside criterion that
is measured at the same time, it is then referred to as concurrent validity. If
the measure is compared to an outside criterion that will be measured in
the future, it is then referred to as predictive validity.

DON’T FORGET

Criterion Validity

Criterion validity is determined by the relationship between a measure
and performance on an outside criterion or measure. Concurrent
criterion validity refers to the relationship between measures
taken at the same time. Predictive criterion validity refers to the
relationship between measures that are taken at different times.

Again, an example may help clarify this concept. Let’s assume that a researcher
is using an instrument or has developed another measurement
strategy to capture the construct of depression. There are a number of
ways that criterion validity could be determined in this case. The measure
would have concurrent criterion validity if the measure indicated depression
and the participant met diagnostic criteria for depression at the same
time. When both suggest the presence of depression, then we have the beginnings
of criterion validity. The measure would have predictive criterion
validity if the measure indicated depression and the participant met diagnostic
criteria for depression at some point in time in the future.
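The depression example can be made concrete by cross-tabulating what the measure indicates against the outside criterion. The following sketch is an assumption-laden illustration: the function name, the true/false coding, and the data are hypothetical, and simple agreement is only one of several ways the relationship with a criterion could be summarized.

```python
def criterion_agreement(measure_positive, criterion_positive):
    """Cross-tabulate a measure against an outside criterion.

    Both arguments are lists of booleans, one entry per participant:
    whether the measure indicated depression, and whether the participant
    met diagnostic criteria (at the same time for concurrent validity,
    or at a later follow-up for predictive validity).
    """
    pairs = list(zip(measure_positive, criterion_positive))
    both_yes = sum(1 for m, c in pairs if m and c)
    both_no = sum(1 for m, c in pairs if not m and not c)
    measure_only = sum(1 for m, c in pairs if m and not c)
    criterion_only = sum(1 for m, c in pairs if not m and c)
    return {
        "both_yes": both_yes,
        "both_no": both_no,
        "measure_only": measure_only,
        "criterion_only": criterion_only,
        "agreement": (both_yes + both_no) / len(pairs),
    }


if __name__ == "__main__":
    # Hypothetical data for five participants.
    measure_says_depressed = [True, True, False, False, True]
    diagnosis_same_day = [True, False, False, False, True]    # concurrent
    diagnosis_at_followup = [True, True, False, False, True]  # predictive

    print("concurrent:", criterion_agreement(measure_says_depressed, diagnosis_same_day))
    print("predictive:", criterion_agreement(measure_says_depressed, diagnosis_at_followup))
```

The same function serves both cases; the only difference between concurrent and predictive criterion validity here is when the criterion was measured.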

The final concept that we will discuss with respect to demonstrating the
validity of an instrument or measurement strategy is construct validity.
Construct validity assesses the extent to which the test or measurement strategy
measures a theoretical construct or trait (Groth-Marnat, 2003). Although
there are numerous approaches for determining construct validity,
we will focus on the two most common methods: convergent and divergent
validity (Bechtold, 1959; Campbell & Fiske, 1959). Again, these concepts
are best illustrated through an example. The first approach is to explore
the relationship between the measure of interest and another
measure that purportedly captures the same construct (i.e., convergent validity).
Consider our depression example. If the instrument or strategy we
were using in our depression study were accurately capturing the construct
of depression, we would expect that there would be a strong relationship
between the measurement in question and other measures of depression.
This relationship would be expressed as the correlation between the two
approaches, or a correlation coefficient. A strong positive correlation between
the two measures would suggest construct validity. Construct validity can
also be demonstrated by showing that two constructs are unrelated (i.e., divergent
validity). For example, we would not expect our measure of depression
to have a strong positive correlation with a measure of happiness. In
this case, construct validity would be expressed as a strong negative
correlation because we would expect the two constructs of happiness
and depression to be inversely related: the happier you are, the less
likely it is that you are suffering from depression.
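The convergent and divergent checks described above both come down to computing correlation coefficients and inspecting their sign and size. The sketch below is illustrative only: the three sets of scores are made up for five hypothetical participants, and the variable and function names are assumptions rather than anything specified in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same five participants on three measures.
new_depression_measure = [10, 20, 30, 40, 50]
established_depression_measure = [12, 18, 33, 41, 48]
happiness_measure = [48, 40, 31, 22, 9]

# Convergent check: two measures of the same construct should correlate
# strongly and positively.
r_convergent = pearson_r(new_depression_measure, established_depression_measure)

# Divergent check as framed in the text: depression and happiness are
# expected to be inversely related, so the correlation should be negative.
r_divergent = pearson_r(new_depression_measure, happiness_measure)

if __name__ == "__main__":
    print(f"convergent r = {r_convergent:.2f}")
    print(f"divergent r = {r_divergent:.2f}")
```

With real data the coefficients would rarely be this extreme; the point of the sketch is only the direction of each relationship.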

DON’T FORGET

Construct Validity

Construct validity assesses the extent to which the test or measurement
strategy measures a theoretical construct or trait. There is a
variety of approaches for determining construct validity. These approaches
focus on the extent to which the measurement of a certain
construct converges or diverges with the measurement of
similar or different constructs.

MEASUREMENT STRATEGIES FOR DATA COLLECTION

So far, we have considered various
basic issues related to measurement. We have highlighted the importance
of scales of measurement and how they can guide data collection. Our discussion
of psychometrics pointed out the importance of considering reliability
and validity when choosing a measurement instrument or approach
to quantify the independent and dependent variables under consideration.
These are important considerations, but this chapter would not be complete
without a discussion of some of the different methods and approaches
used for collecting the data for the constructs of interest. Remember
that the constructs of interest in any research study tend to be
defined in terms of independent and dependent variables.

So, how do we measure our independent and dependent variables?
They are, after all, the focus of any study. The number of available measurement
strategies is staggering, and is sometimes limited only by the
researcher’s imagination and choice of research question. The choice of
strategy also tends to vary by research question and research design, which
is why it is difficult to account for every type of measurement approach.
Despite this, the choice of measurement strategy is usually driven by a variety
of factors that progress from general to specific.

The broadest consideration is always the nature of the research question
and the independent and dependent variables. In other words, the
researcher decides how best to measure the independent and dependent
variables with the ultimate goal being to answer the research question.
Addressing this broad and all-important choice requires the consideration of
more specific factors.

For example, our earlier discussion highlighted the importance of scales
of measurement. At what level should we try to measure our variables,
knowing that this decision can affect our ability to employ certain statistical
techniques during the data analysis stage? At this point, the thought
might come to mind that all the researcher has to do is find a way to measure
the variables of interest at the interval or ratio level of measurement.
Although this might allow for the use of preferred statistical techniques, it
is not always possible or even desirable to measure variables at the interval
and ratio levels because not all variables lend themselves to these levels of
measurement. Take a moment to think about all of the interesting and critically
important variables that are measured by the nominal or ordinal
scales of measurement. Gender, race, ethnicity, religious affiliation, employment
status, and political party affiliation are all examples of nominal
or ordinal data that are common in many forms of social science research.

Another factor might be related to the psychometric properties of the
measurement strategy. Although reliability and validity are usually considered
primarily in the context of psychological tests and other instruments,
the concepts are important to consider in all types of measurement. The
fact that you are not using a psychological test or other psychometrically
validated instrument does not mean that reliability and validity are no
longer important considerations. Regardless of what you are measuring
and how you do so, that measurement approach should measure what it
purports to measure and do so in a consistent fashion.

For psychological and other tests, a related issue is whether the instrument
is appropriate for the population the researcher is studying. For example,
consider a case in which a researcher wants to use an established,
commercially available instrument to assess levels of depression in the elderly.
The researcher would have to make certain that the test developers
considered and captured this population when developing the instrument.
If they did not, then it would be inappropriate to use the instrument to
study depression in this population.

Availability is another important consideration when selecting a measurement
strategy. What approaches, if any, already exist for measuring the
construct of interest? One might want to consider established forms of
measurement, such as psychometrically based tests. Instruments of this
type can be researched by consulting the most recent version of the Mental
Measurements Yearbook. For example, there is a wide variety of psychometrically
sound instruments available for the measurement of depression
and personality. Another approach might be to review related research to
see how others have measured the construct or similar constructs. The literature
might suggest what instrument has been used most often to measure
the construct of interest with the same population that you are interested
in. Or, if there is no instrument available, it might suggest an
appropriate strategy for capturing the construct. For example, previously
conducted research might provide a framework for designing a unique assessment
strategy for quantifying specific behavioral problems with young
children. Note that original research questions might require the development
of unique and specialized assessment instruments and strategies.

Cost is another consideration. Funding tends to vary from study to
study. Some studies are well funded, while others are conducted with little
or no funding. Those of you who conducted dissertation research with actual
participants probably have some experience with the little-or-no-funding
category. One of the primary drawbacks of using commercially
available instruments is that they can be costly, hence the expression
“commercially” available. There is considerable variation in the cost associated
with various instruments. Some are very reasonable and others are
cost prohibitive. The cost consideration is partially dependent on how
many participants are in the study. The more participants to be measured
on some construct, the higher the cost. In studies for which money is a serious
consideration, the use of some commercially available instruments
might be prohibitive. This might require the researcher to develop or create
a measure or assessment strategy to capture the constructs of interest.
Although this is relatively common, there are some potential problems
that arise from creating a new measure or measurement strategy. The first
concern is that new instruments and strategies might have questionable
reliability and validity. It cannot be assumed that the instrument or strategy
is reliable or valid. At a minimum, the researcher will have to take steps
to demonstrate the reliability and validity of the measurement approach.
After all, you have to measure variables in a reliable and valid fashion before
you can make any statements about the relationship between them,
regardless of what statistical analysis might suggest.

Another issue regarding unique measurement approaches and instruments
relates to the existing body of scientific literature in a given topic
area. There are certain instruments and approaches that tend to appear in
the scientific literature for the study of given topics. For example, there are
a number of common measures of personality and depression that appear
consistently in the research literature. Studies using these instruments can
add to an existing body of literature. Conversely, studies using obscure or
unique instruments and approaches, although valuable in and of themselves,
might not be as relevant to that body of literature because the measurement
strategies are not consistent and therefore not directly comparable.

Training is another factor to consider when selecting a measurement instrument
or strategy. Training is important for two reasons: The first relates
to the training of the researcher and is usually related primarily to the
use of commercially available psychological and related tests. Many test
providers have minimum user requirements. In our case, that would mean
that the researcher must meet certain educational and/or training requirements
before the company will permit the use of the instrument in the
study. Although the requirements vary by test, the typical user must have
an advanced degree in the social sciences or education, and/or have specific
training in psychometrics. In some instances, test developers will allow
the use of these instruments by less-qualified individuals if they attend
a training seminar that provides a certification in the proper use of the instrument.

The second reason relates to training in a broad sense. The use of measurement
instruments and strategies, whether commercially available or
not, requires a theoretical foundation related to the construct of interest.
For example, a researcher measuring some aspect of personality should be
familiar with personality theory and the theoretical approach adopted by
the instrument or strategy in question. Similarly, a researcher interested in
evaluating the effectiveness of a behavioral modification system for children
should be familiar with the theoretical underpinnings and practical
application of concepts related to behavior modification before designing
the measurement strategy. Remember that all validation begins after a
concept has been given an accurate operational definition that reflects the
construct of interest. Appropriate training assists in this process and is the
first step in addressing the validity of the measurement strategy or instrument.

The  time  needed  to  conduct  the  measurement  and  the  ease  of  its  use  are 
the  last  two  factors  that  we  will  consider.  Researchers  should  let  the  con¬ 
cept  of  parsimony  guide  them  here.  Generally, parsimony  refers  to  selecting 
the  simplest  explanation  for  a  phenomenon  when  there  are  competing 
explanations  available  (Kazdin,  2003c).  The  key  concept  here  is  simplicity. 
Researchers should attempt to measure the variables of interest as efficiently
and accurately as possible. Remember the importance of reliability
and  validity.  Depending  on  the  construct,  a  longer  and  more  complicated 
assessment  will  not  necessarily  provide  a  more  accurate  measurement 
than  a  strategy  that  is  less  complicated  and  takes  half  the  time.  In  addition, 
the  likelihood  of  mistakes,  fatigue,  or  inattention  among  both  researchers 
and  participants  might  become  more  prevalent  as  the  measurement  strat¬ 
egy  becomes  more  time  intensive  and  complicated.  This,  in  turn,  could  af¬ 
fect  the  accuracy  of  the  data.  In  short,  avoid  unnecessarily  long  and  com¬ 
plicated  assessment  procedures  whenever  possible. 

METHODS  OF  DATA  COLLECTION 

With  these  factors  in  mind,  we  will  now  discuss  some  of  the  more  com¬ 
mon  approaches  to  data  collection  and  measurement  in  research.  Again, 
there  are  many  different  approaches  to  data  collection,  and  this  discussion 
is not intended to be exhaustive of the subject matter. Despite this, there
are  certain  broad  categories  that  encompass  the  more  common  types  of 
data  collection  techniques.  Generally,  and  not  surprisingly,  the  research 
question  and  the  nature  of  the  variables  under  investigation  usually  drive 
the  choice  of  measurement  strategy  for  data  collection. 

We  have  mentioned  the  use  of  psychological  testing  and  other  similar 
commercially  available  instruments  throughout  this  chapter.  The  use  of 
this  type  of  testing  in  research  is  very  common,  especially  in  psychology, 
education,  and  other  social  sciences.  A  brief  survey  of  available  instru¬ 
ments  suggests  that  we  can  capture  a  wide  variety  of  factors  related  to  the 
human  experience.  For  example,  instruments  exist  that  allow  researchers 
to  measure  personality,  temperament,  adjustment,  symptom  level,  behav¬ 
ior,  career  interest,  memory,  academic  achievement  and  aptitude,  emo¬ 
tional  competence,  and  intelligence.  These  instruments  are  attractive  to 
researchers  because  they  tend  to  have  established  reliability  and  validity, 
and  they  eliminate  the  need  to  develop  and  validate  an  instrument  from 
scratch.  Many  of  these  instruments  also  produce  data  at  the  interval  and 
ratio  levels,  which  is  a  prerequisite  feature  for  certain  types  of  statistical 
analyses.  The  development  of  new  instruments  is  best  left  to  specialists 
with  extensive  training  in  psychological  testing,  psychometrics,  and  test 
development.  In  other  words,  always  consider  existing  instruments  as  data 
collection  methods  before  developing  one  of  your  own.  A  poorly  designed 
measurement  strategy  can  confound  the  results  of  even  the  best  research 
design.  Again,  let  reliability  and  validity  be  your  guides. 

Although  testing  is  common,  it  is  not  the  only  method  for  data  collec¬ 
tion  available  to  researchers.  There  are  often  times  when  it  is  necessary  to 
adopt  another  approach  to  data  collection.  As  we  discussed  earlier,  there 
are  many  reasons  that  this  might  be  the  case.  For  example,  not  all  variables 
of  interest  have  been  operationalized  in  the  form  of  standardized  tests,  or 
some  research  questions  might  require  unique  or  different  approaches. 
Cost  and  time  constraints  might  also  be  important  considerations.  In  cases 
like  these,  the  researcher  might  have  to  consider  and  adopt  other  data  col¬ 
lection  strategies.  In  many  cases,  these  strategies  are  just  as  valid  as,  and  are 
even  preferable  to,  the  use  of  formal  testing. 
Some of these alternative approaches, as summarized in Rapid Reference 4.9,
include interviewing, global ratings, observation, and biological measures.
As we will see, sometimes the most efficient data collection techniques are
also the simplest.

A  thorough  interview  is  a  form 
of  self-report  that  is  a  relatively 
simple  approach  to  data  collec¬ 
tion.  Although  simple,  it  can  pro¬ 
duce  a  wealth  of  information.  An 
interview  can  cover  any  number 
of  content  areas  and  is  a  relatively 
inexpensive  and  efficient  way  to  collect  a  wide  variety  of  data  that  does  not 
require  formal  testing.  One  of  the  most  common  uses  of  the  interview  is 
to  collect  life-history  and  biographical  data  about  the  research  participants 
(Anastasi  &  Urbina,  1997;  Stokes,  Mumford,  &  Owens,  1994).  The  effec¬ 
tiveness  of  an  interview  depends  on  how  it  is  structured.  In  other  words, 
the  interview  should  be  thought  out  beforehand  and  standardized  so  that 
all  participants  are  asked  the  same  questions  in  the  same  order.  Similarly, 
the  researchers  conducting  the  interview  should  be  trained  in  its  proper 
administration  to  avoid  variation  in  the  collection  of  data.  Interviews  are 
a  relatively  common  way  of  collecting  data  in  research  and  the  data  they 
collect  and  the  forms  they  take  are  limited  only  by  the  requirements  of  the 
research  question  and  the  related  research  design.  One  drawback  of  using 
an  interview  procedure  is  that  the  data  obtained  may  not  be  appropriate 
for  extensive  statistical  analysis  because  they  simply  describe  a  construct 
rather  than  quantifying  it. 

Examples of interviews are not difficult to identify. Employment interviews
are a classic example. Although they are not typically used in research
studies, their main goal is to gather data that will allow a company
to answer the research question (so to speak) of whether someone would
make a good employee. Interviews are also an essential component of
most types of qualitative research, which is briefly discussed in Chapter 5.
For example, if we were interested in the impact of childhood trauma on a
participant’s current functioning, we might construct an interview to capture
his or her experiences from childhood through adulthood.

Main Approaches to Measurement and Data Collection in Research Methods

•  ... educational, academic, intelligence)

•  Interviewing

•  Global ratings

•  Observation

•  Biological measures

Like  interviews,  global  ratings  are  another  form  of  self-report  that  is 
commonly  used  as  a  data  collection  technique  in  research.  Unlike  an  in¬ 
terview,  this  approach  to  measurement  attempts  to  quantify  a  construct  or 
variable  of  interest  by  asking  the  participant  to  rate  his  or  her  response  to 
a  summary  statement  on  a  numerical  continuum.  This  is  less  complex  than 
it  sounds,  and  everyone  has  been  exposed  to  this  data  collection  approach 
at  one  point  in  time  or  another.  If  a  researcher  were  interested  in  measur¬ 
ing  attitudes  toward  a  class  in  research  methods,  he  or  she  could  develop 
a  set  of  summary  statements  and  then  ask  the  participants  to  rate  their  at¬ 
titudes  along  a  bipolar  continuum.  One  statement  might  look  like  this: 


On a scale of 1 to 5, please rate the extent to which you enjoy the
research-methods class.

    1            2            3            4            5
  Hate it                  Neutral                   Love it

In  this  example,  the  participant  would  simply  circle  the  appropriate  num¬ 
ber  that  best  reflects  his  or  her  attitude  toward  the  research-methods  class. 
The  use  of  global  ratings  is  also  common  when  asking  participants  to  rate 
emotional  states,  symptoms,  and  levels  of  distress. 

The  strength  of  global  ratings  is  that  they  can  be  adapted  for  a  wide  va¬ 
riety  of  topics  and  questions.  They  also  yield  interval  or  ratio  data.  Despite 
this,  researchers  should  be  aware  that  such  a  rating  is  only  a  global  measure 
of  a  construct  and  might  not  capture  its  complexity  or  more  subtle  nu¬ 
ances.  For  example,  the  previous  example  may  tell  us  how  much  someone 
enjoys  a  certain  research-method  class,  but  it  will  not  tell  us  why  the  per¬ 
son  either  loves  it  or  hates  it. 
Observation is another versatile approach to data collection. This approach
relies on the direct observation of the construct of interest, which
is  often  some  type  of  behavior.  In  essence,  if  you  can  observe  it,  you  can 
find  some  way  of  measuring  it.  The  use  of  this  approach  is  widespread  in 
a  variety  of  research,  educational,  and  treatment  settings. 

Let’s  consider  the  use  of  observation  in  a  research  setting.  This  ap¬ 
proach  is  an  efficient  way  to  collect  data  when  the  researcher  is  interested 
in  studying  and  quantifying  some  type  of  behavior.  For  example,  a  re¬ 
searcher  might  be  interested  in  studying  cooperative  behavior  of  young 
children  in  a  classroom  setting.  After  operationalizing  “cooperative  be¬ 
havior”  as  sharing  toys,  the  researcher  develops  a  system  for  quantifying 
the  behavior.  In  this  case,  it  might  be  as  simple  as  sitting  unobtrusively 
in  a  corner  of  the  classroom,  observing  the  behavior  of  the  children,  and 
counting  the  number  of  times  that  they  engage  in  cooperative  behavior. 
Alternatively,  if  we  were  interested  in  studying  levels  of  boredom  in  a 
research-methods  class,  we  could  simply  count  the  number  of  yawns  or 
number  of  times  that  someone  nods  off. 

As  with  other  forms  of  data  collection,  the  process  of  quantifying  ob¬ 
servations  should  be  standardized.  The  behavior  in  question  must  be  ac¬ 
curately  operationalized  and  everyone  involved  in  the  data  collection 
should  be  trained  to  ensure  accuracy  of  observation.  Proper  operational¬ 
ization  of  the  variable  and  adequate  training  should  help  ensure  adequate 
validity  and  interrater  reliability.  Videotaping  and  multiple  raters  are  fre¬ 
quently  used  to  confirm  the  accuracy  of  the  observations.  The  use  of  ob¬ 
servational  methods  usually  produces  frequency  counts  of  a  particular 
behavior  or  behaviors.  These  data  are  frequently  at  the  interval  and  ratio 
level. 
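
One way to check interrater reliability is to compare the codes that two trained observers assign to the same observation intervals. The sketch below uses hypothetical data (1 = sharing observed in the interval, 0 = not observed) and computes simple percent agreement along with Cohen's kappa. Kappa is a chance-corrected agreement index that the text does not name; it is included here only as an illustrative assumption, and the function and variable names are invented for the example.

```python
from collections import Counter

# Hypothetical codes from two observers for the same ten intervals
# (1 = cooperative behavior observed, 0 = not observed).
rater_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]


def percent_agreement(a, b):
    """Proportion of intervals on which the two raters gave the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(a)
    observed = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's marginal proportions per code.
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


agreement = percent_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
print(agreement, round(kappa, 2))
```

Low agreement would suggest that the operational definition or the observer training needs to be revisited before the frequency counts are analyzed.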

Obtaining  biological  measures  is  another  strategy  for  collecting  research 
data.  This  approach  is  common  in  medical  and  psychobiological  research. 
It  often  involves  measuring  the  physiological  responses  of  participants 
to  any  number  of  potential  stimuli.  The  most  common  examples  of  re¬ 
sponses  include  heart  rate,  respiration,  blood  pressure,  and  galvanic  skin 
response.  As  with  all  of  the  forms  of  measurement  that  we  have  discussed, 
operationalization and standardization are essential. Consider a
study  investigating  levels  of  anxi¬ 
ety  in  response  to  a  certain  aver¬ 
sive  stimulus.  We  could  use  any  of 
the  other  measurement  ap¬ 
proaches  to  gather  the  data  we 
need  regarding  anxiety,  but  we 
chose  instead  to  collect  biological 
data  because  it  is  very  difficult  for 
people  to  regulate  or  fake  their  re¬ 
sponses.  We  operationalize  anxi¬ 
ety  as  scores  on  certain  physiolog¬ 
ical  responses,  such  as  heart  rate 
and  respiration.  Each  participant 
is  exposed  to  the  stimulus  in  the 
exact  same  fashion  and  then  is  measured  across  the  biological  indicators 
we  chose  to  operationalize  anxiety.  The  data  obtained  from  biological 
measures  are  frequently  at  the  interval  or  ratio  level. 


DON'T FORGET


Multiple  Measurement 
Strategies 

Multiple  measurement  strategies 
can  be  used  in  a  research  study, 
even  if  they  are  all  used  to  mea¬ 
sure  the  same  construct  or  vari¬ 
able.  For  example,  a  psychological 
test,  an  interview,  and  a  global  rat¬ 
ing  could  all  be  used  to  measure 
the  construct  of  depression. This 
may  be  considered  an  optimal  ap¬ 
proach,  as  convergence  on  multi¬ 
ple  measures  would  increase  over¬ 
all  confidence  in  a  study’s  findings. 


SUMMARY 

This  chapter  focused  on  important  issues  and  considerations  related  to 
various  aspects  of  data  collection  and  measurement.  Measurement  strate¬ 
gies  are  an  integral  aspect  of  research  design  and  methodology  that  should 
be  considered  at  the  earliest  stages  of  design  conceptualization.  Special 
consideration  should  be  given  to  scales  of  measurement,  psychometric 
properties,  and  specific  measurement  strategies  for  collecting  data.  Ulti¬ 
mately,  measurement  is  critical  in  research  because  it  allows  researchers  to 
quantify  abstract  constructs  and  variables.  This  is  an  essential  step  in  ex¬ 
ploring  the  relationship  between  various  independent  and  dependent  vari¬ 
ables. 
Putting It Into Practice


An  Example 

Suppose a researcher is asked to design a study to examine student attitudes
toward two different research-methods classes taught by two different
instructors. The researcher is told that the purpose of the study is to
determine whether there are significant differences in satisfaction between
the two classes. The referral source cannot provide a significant
level of funding. The researcher starts by clarifying the research question
and the variables to be quantified and studied. The referral source wants
to quantify whether there are significant differences between the two
classes’ satisfaction levels with regard to a variety of class components,
such as class size, quality of the instructor, usefulness of the textbook,
pace of the class, and so on. These components are the variables of interest.
The referral source wants to compare the two classes, which suggests
that certain parametric statistical tests (e.g., a t-test) will be used to
determine whether there are differences between the two classes on the
variables of interest. Accordingly, the researcher decides that the variables of
interest should be measured at the interval or ratio level.

The key question is what measurement strategy to use. The researcher
needs a measurement strategy that allows for measurement at the interval
or ratio level. Not surprisingly, a review of the Mental Measurements Yearbook
and the literature reveals that there are no existing measures of student
satisfaction toward certain components of a research-methods class.
Furthermore,  an  interview  will  not  provide  interval  or  ratio  data,  and  it 
might  be  inappropriate  to  take  biological  measurements  in  this  setting  be¬ 
cause  it  would  certainly  be  cost  prohibitive  and  would  disrupt  the  flow  of 
the  classes.  Behavioral  observation  might  allow  us  to  infer  satisfaction,  but  it 
is  not  a  direct  measure  of  the  variables  we  have  been  asked  to  assess.  Re¬ 
member  that  what  is  being  measured  is  satisfaction  with  a  number  of  dif¬ 
ferent  course  components,  and  not  just  general  satisfaction  with  the  class. 
The  researcher  decides  to  use  global  ratings.  Questions  are  designed  to 
capture  the  variables  of  interest  and  the  students  will  be  asked  to  respond 
on a scale from 1 to 5, with 5 suggesting extreme satisfaction and 1
suggesting extreme dissatisfaction. This approach is cost effective and will
provide data at the interval level (because there is no absolute zero on the
scale), which will allow for the use of the preferred parametric statistical
technique. Wanting to be thorough, the researcher includes an open-ended
question (an interview question) with each global rating so that the
students  can  elaborate  on  their  numerical  rating  with  narrative  material. 
Although  this  type  of  information  does  not  lend  itself  to  statistical  analysis, 
it  should  provide  more  specifics  as  to  why  the  students  are  satisfied  or 
dissatisfied  with  various  class  components. The  data  are  collected  and  ana¬ 
lyzed,  and  the  results,  perhaps  not  surprisingly,  suggest  that  everyone  is 
dissatisfied  with  everything  about  research  methods! 
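
The comparison described in this example can be sketched in a few lines. The ratings below are hypothetical 1-to-5 responses to a single item (for instance, pace of the class) from the two classes. The example only names "a t-test", so the choice of Welch's unequal-variance form of the t statistic is an assumption, as are all of the names and values used.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical global ratings (1 = extreme dissatisfaction, 5 = extreme
# satisfaction) on one class component, one list per class.
class_a = [4, 5, 3, 4, 4, 5, 3, 4]
class_b = [2, 3, 3, 2, 4, 3, 2, 3]


def welch_t(x, y):
    """t statistic for two independent groups, not assuming equal variances."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


t_stat = welch_t(class_a, class_b)
print(mean(class_a), mean(class_b), round(t_stat, 2))
```

The sketch stops at the test statistic. Judging significance would also require the degrees of freedom and a reference t distribution, which a statistics package would normally supply.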


TEST YOURSELF


1. _____ is often defined as a process through which researchers describe,
explain, and predict the phenomena and constructs of our daily existence.

2. _____ data constitute the highest level of measurement and allow for
the use of sophisticated statistical techniques.

3. _____, or qualitative, data are the attributes, characteristics, or
categories that describe an individual and are used predominantly as a method
of describing and categorizing. _____, or quantitative, data refer to
differing amounts or degrees of an attribute, and these data reflect
relative quantity or distance.

4. A measurement can be valid, but not reliable. True or False?

5. _____ and _____ are two important psychometric considerations
when selecting psychological and other tests.

Answers: 1. Measurement; 2. Ratio; 3. Nonmetric, Metric; 4. False (A measure must be
reliable to be valid.); 5. Reliability, validity
Five


GENERAL  TYPES  OF  RESEARCH 
DESIGNS  AND  APPROACHES 


Once  the  researcher  has  determined  the  specific  question  to  be 
answered  and  has  operationalized  the  variables  and  research 
question into a clear, measurable hypothesis, it is time to consider
a suitable research design. Although there are endless ways of classifying
research designs, they usually fall into one of three general categories:
experimental, quasi-experimental, and nonexperimental. This
classification  system  is  based  primarily  on  the  strength  of  the  design’s  ex¬ 
perimental  control.  To  determine  the  classification  of  a  particular  research 
design,  it  is  helpful  to  ask  several  key  questions.  First,  does  the  design  in¬ 
volve  random  assignment  to  different  conditions?  If  random  assignment 
is  used,  it  is  considered  to  be  a  randomized,  or  true,  experimental  design. 
If  random  assignment  is  not  used,  then  a  second  question  must  be  asked: 
Does  the  design  use  either  multiple  groups  or  multiple  waves  of  measure¬ 
ment?  If  the  answer  is  yes,  the  design  is  considered  quasi-experimental.  If 
the  answer  is  no,  the  design  would  be  considered  nonexperimental  (see 
Trochim,  2001). 
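
The two classification questions above amount to a short decision rule. The following sketch encodes that rule as stated; the function name and argument names are hypothetical and chosen only for illustration.

```python
def classify_design(random_assignment, multiple_groups_or_waves):
    """Classify a research design using the two key questions in the text."""
    # Question 1: are participants randomly assigned to conditions?
    if random_assignment:
        return "experimental"
    # Question 2: does the design use multiple groups or multiple waves
    # of measurement?
    if multiple_groups_or_waves:
        return "quasi-experimental"
    return "nonexperimental"


print(classify_design(True, True))
print(classify_design(False, True))
print(classify_design(False, False))
```

Note that the second question only matters when random assignment is absent, which is why it is checked second.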

Although each of the three types of research designs can provide useful
information, they differ greatly in the degree to which they enable
researchers to draw confident causal inferences from a study’s findings (as
discussed in Chapter 1). In this chapter, we will review each of the three
classes of research design, the ways that each type of research design is
applied, and the overall strengths and weaknesses of each type of research
design.


EXPERIMENTAL DESIGNS


A  true  experimental  design  is  one  in  which  study  participants  are  ran¬ 
domly  assigned  to  experimental  and  control  groups.  We  have  discussed 
randomization  in  previous  chapters,  so  this  chapter  will  simply  highlight 
the  importance  of  randomization  in  terms  of  the  strength  of  a  research  de¬ 
sign.  Although  randomization  is  typically  described  using  examples  such 
as  rolling  dice,  flipping  a  coin,  or  picking  a  number  out  of  a  hat,  most  stud¬ 
ies  now  rely  on  the  use  of  random  numbers  tables  to  help  them  assign  their 
research  participants  (as  discussed  in  Chapters  2  and  3). 

A  random  numbers  table  is  nothing  more  than  a  random  list  of  numbers 
displayed  or  printed  in  a  series  of  columns  and  rows.  Typically,  computer 
programs  that  generate  such  lists  allow  you  to  request  a  specific  quantity 
and  range  of  numbers  to  be  generated.  To  use  a  random  numbers  table  to 
assign  study  participants  to  groups,  you  must  first  determine  the  exact 
numbers  that  you  will  use  to  determine  the  assignments.  For  example,  if 
you have three groups or conditions, you may use the numbers 1, 2, and 3.
Alternatively,  if  you  were  assigning  participants  to  two  groups,  you  could 
use  the  numbers  1  and  2,  or  simply  odd  or  even  numbers,  to  determine  the 
group  assignments.  The  important  point  is  that  you  define  the  assignment 
criteria  ahead  of  time,  so  that  your  selections  are  not  biased  and  remain 
purely  random. 

After selecting your assignment criteria, you must randomly identify a
starting place in the random numbers table. This is usually done by either
selecting a starting place on the table before beginning (e.g., top right of
third column) or simply closing your eyes and randomly pointing to a location
on the table, which will serve as the starting point. Once you have selected
a starting point, you will simply move through the list (either down the
columns or across the rows) and identify each instance that numbers in your
selected range appear until you have group assignments for your entire sample
of participants.


DON'T FORGET


Random Numbers Table

A random numbers table is nothing more than a random list of numbers
displayed or printed in a series of columns and rows. Using a random
numbers table is one effective way to randomly assign participants to
groups within a research study.

To illustrate, assume that you are planning to assign 100 participants to
one of four different groups. You begin by defining the numbers 1, 2, 3,
and 4 as the criteria for your group assignments. You then randomly point
to a spot on the table from which to begin, and go down the columns of
numbers one by one listing each appearance of 1, 2, 3, or 4, while skipping
all other numbers. Once you have listed 100 numbers, you will be done.
The first number that you listed will determine the first participant’s
assignment, the second number will determine the second participant’s
assignment, and so forth. For example, using the table below, assume that we
begin with number 0480 in the top row, left-most column of the table. If
we worked our way down the columns, from left to right, listing appearances
of 1, 2, 3, or 4 in the last digit of each number, we would
wind up with the following series of assignments: 2, 4, 1, 1, 3, 3, 2, 3, 1, 3,
4, 1, 3, 1, 4, 2.


0480    5011    1536    2011    1647    9174
2362    6573    5595    5393    0995    9198
4134    8360    2527    7265    6393    4809
2167    3093    6243    1684    7856    6376
7570    9975    1837    6656    6121    1782
7921    6902    1008    2751    7756    3498
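
The table-reading procedure can be checked with a short script. The sketch below treats the 36 values as a 6 x 6 grid filled row by row (an assumption about the table's layout, chosen because it reproduces the series given in the text), then walks down each column from left to right and keeps every last digit that is 1, 2, 3, or 4. The function and variable names are hypothetical.

```python
# The 36 table values, arranged as six rows of six columns.
TABLE = [
    ["0480", "5011", "1536", "2011", "1647", "9174"],
    ["2362", "6573", "5595", "5393", "0995", "9198"],
    ["4134", "8360", "2527", "7265", "6393", "4809"],
    ["2167", "3093", "6243", "1684", "7856", "6376"],
    ["7570", "9975", "1837", "6656", "6121", "1782"],
    ["7921", "6902", "1008", "2751", "7756", "3498"],
]


def assignments_from_table(table, valid_digits=("1", "2", "3", "4")):
    """Read down each column, left to right, keeping usable last digits."""
    picks = []
    for col in range(len(table[0])):
        for row in range(len(table)):
            last_digit = table[row][col][-1]
            if last_digit in valid_digits:
                picks.append(int(last_digit))
    return picks


series = assignments_from_table(TABLE)
print(series)
# [2, 4, 1, 1, 3, 3, 2, 3, 1, 3, 4, 1, 3, 1, 4, 2]
```

The first value in the series is the first participant's group, the second value is the second participant's group, and so on. This small table yields only 16 assignments, so a real study with 100 participants would need a much longer list of random numbers.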

Although  the  standard  randomization  procedure  will  ensure  random¬ 
ized  groups,  it  will  not  necessarily  result  in  groups  of  equal  size.  To  obtain 
randomized groups of equal sizes, you could use a block randomization
procedure. This procedure is carried out in the same manner as discussed,
except that participants are grouped into blocks. Each block will consist of
one assignment to each of the study groups. Therefore, the number of
participants per block is the same as the number of groups in the study.
Using the prior example, you would proceed down the columns listing each
appearance of 1, 2, 3, or 4 only once until the first block is full, before
moving to the second block of four assignments, and so forth, until you have
assigned 100 participants into a total of 25 blocks of four. Regardless of
the  technique  used  to  randomly  assign  participants  to  groups  within  a 
study,  random  assignment  increases  the  likelihood  that  changes  in  the  de¬ 
pendent  variable  are  attributable  to  the  independent  variables  rather  than 
to  extraneous  factors  or  nuisance  variables. 
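
Block randomization is also easy to sketch with a computer's pseudorandom generator in place of a printed table. In the example below, each block is a shuffled copy of the group labels, so every block contains exactly one assignment to each group. Shuffling is a substitute for reading a random numbers table, not the procedure described above, and the function name, the seed value, and the group labels are assumptions made for illustration.

```python
import random


def block_randomize(n_participants, groups=(1, 2, 3, 4), seed=None):
    """Assign participants in shuffled blocks, one slot per group per block."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    assignments = []
    while len(assignments) < n_participants:
        block = list(groups)
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    # Trim in case n_participants is not a multiple of the block size.
    return assignments[:n_participants]


schedule = block_randomize(100, seed=2005)
counts = {group: schedule.count(group) for group in (1, 2, 3, 4)}
print(counts)
# {1: 25, 2: 25, 3: 25, 4: 25}
```

With 100 participants and four groups, the schedule consists of 25 complete blocks, so the group sizes are equal regardless of the seed.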

For  example,  a  researcher  examining  the  effectiveness  of  a  certain  treat¬ 
ment  will  want  to  be  confident  that  the  experimental  group  (the  group 
receiving  the  new  treatment)  does  not  differ  from  the  control  group  (the 
group  receiving  an  alternative  or  placebo  intervention)  at  the  start  of  the 
study.  Otherwise,  the  researcher  will  be  unable  to  confidently  attribute  any 
between-group  differences  that  appear  at  the  end  of  the  study  to  the  treat¬ 
ment  rather  than  to  some  preexisting  differences.  Although  the  researcher 
could  attempt  to  make  the  groups  more  comparable  by  matching  the  two 
groups  on  any  number  of  variables,  it  would  ultimately  be  impossible  to 
make  the  groups  identical.  There  are  simply  too  many  (perhaps  an  infinite 
number  of)  other  individual  differences  that  remain  uncontrolled  for  and 
that  may  influence  the  study’s  outcome. 

For  example,  the  researcher  may  carefully  match  the  two  groups  on 
characteristics  such  as  age,  gender,  race,  and  socioeconomic  status  with 
the  belief  that  these  variables  may  have  an  impact  on  treatment  outcomes. 
Although  this  procedure  may  make  the  groups  more  similar,  the  groups 
may  still  differ  on  other  potentially  important  yet  unmeasured  variables, 
such  as  level  of  intelligence,  degree  of  motivation,  or  prior  treatment  ex¬ 
periences.  The  fact  that  the  groups  may  differ  on  some  unknown  and  un¬ 
measured  variable  substantially  reduces  the  researcher’s  ability  to  attribute 
changes  in  the  dependent  variable  to  the  independent  variable  and  to  draw 
valid  causal  conclusions  from  the  data.  Randomization,  however,  tends  to 
distribute  individual  differences  equally  across  groups  so  that  the  groups 
differ  systematically  in  only  one  way:  the  intervention  being  examined  in 
the  study. 

It is primarily for this reason that in most instances, when feasible, the
randomized experimental design is the preferred method of research. Put
simply,  it  provides  the  highest  degree  of  control  over  a  research  study,  and 
it  allows  the  researcher  to  draw  causal  inferences  with  the  highest  degree 
of  confidence.  In  general,  randomized  or  true  experiments  can  be  con¬ 
ducted  using  one  of  three  main  designs:  (1)  a  randomized  two-group 
posttest  only  or  pretest-posttest  design,  (2)  a  Solomon  four-group  design, 
or  (3)  a  factorial  design.  The  following  notation  will  be  used  to  describe  the 
different  designs: 

X  =  experimental  manipulation  (independent  variable);  sub¬ 
scripts  identify  different  levels  or  groups  of  the  independent 
variable (e.g., X1, X2, X3 is used to denote either a no-intervention
or alternative-intervention control group)

Y  =  experimental  manipulation  (independent  variable)  other 
than  X 

O  =  observation 

R  =  indication  that  participants  have  been  randomly  assigned 
NR  =  indication  that  participants  have  not  been  randomly  assigned 


Randomized  Two-Group  Design 

In  their  simplest  form,  true  experiments  are  composed  of  two  groups  or 
two  levels  of  an  independent  variable.  Of  course,  as  discussed  in  Chapter 
2,  these  designs  could  incorporate  any  number  of  levels  of  an  independent 
variable  and  could  thus  consist  of  three,  four,  or  any  other  number  of 
groups.  The  primary  purpose  of  this  design  is  to  demonstrate  causality — 
that  is,  to  determine  whether  a  specific  intervention  (the  independent  vari¬ 
able)  causes  an  effect  (as  opposed  to  being  merely  correlated  with  an  ef¬ 
fect). 

For  example,  a  researcher  studying  smoking  cessation  may  randomly 
assign  identified  cigarette  smokers  either  to  a  novel  medication  (experi¬ 
mental)  group  or  to  a  comparison  (control)  group.  There  are  several  dif¬ 
ferent  types  of  control  or  comparison  groups  that  can  be  used  in  this  type 
of  design.  The  type  of  comparison  group  that  is  used  largely  depends  on 
the specifics of the research hypothesis and the factors that the researcher
wishes  to  control.  For  example,  if  the  researcher  wishes  to  examine 
whether  the  intervention  is  more  effective  than  no  treatment  at  all,  the  re¬ 
searcher  may  choose  to  use  some  form  of  placebo  control  group.  The 
placebo  control  condition  may  involve  a  seemingly  useful  intervention, 
but  one  that  has  no  demonstrable  effects  (e.g.,  a  sugar  pill).  This  would 
control  for  effects  that  may  occur  in  the  experimental  groups  as  a  result 
of  experimenter  attention  or  other  forms  of  bias.  Alternatively,  if  the  re¬ 
searcher  wants  to  know  whether  the  intervention  is  superior  to  a  standard 
treatment,  the  researcher  would  choose  the  standard  intervention  as  the 
comparison  group.  There  are  two  basic  types  of  randomized  two-group 
designs:  the  posttest  only  and  the  pretest-posttest  design. 

Randomized  Two-Group  Posttest  Only  Design 

In  its  most  basic  form,  the  two-group  experimental  design  may  involve 
little  more  than  random  assignment  and  a  posttest,  as  depicted  here: 

R— X1— O 

R— X2— O 

Because  individual  characteristics  are  assumed  to  be  equally  distributed 
through  randomization,  there  is  theoretically  no  real  need  for  a  pretest  to 
assess  the  comparability  of  the  groups  prior  to  the  intervention.  In  this  de¬ 
sign,  random  assignment  ensures,  to  some  degree,  that  the  two  groups  are 
equivalent  before  treatment  so  that  any  posttreatment  differences  can  be 
attributed  to  the  treatment.  This  simple  design  encompasses  all  the  neces¬ 
sary  elements  of  a  true  randomized  experiment:  (1)  random  assignment, 
to  distribute  extraneous  differences  across  groups;  (2)  intervention  and 
control  groups,  to  determine  whether  the  treatment  had  an  effect;  and  (3) 
observations  following  the  treatment. 
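
As a rough illustration of the first element, random assignment, the following minimal Python sketch shuffles a list of participants and splits it into two groups. The function name, the participant IDs, and the fixed seed are our own assumptions for illustration; they are not part of any standard procedure described in this chapter.

```python
import random


def randomize_two_groups(participant_ids, seed=None):
    """Shuffle participants and split them into two (nearly) equal groups.

    Hypothetical helper: "X1" is the intervention group and "X2" the
    control group, following the notation used in this chapter.
    """
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(participant_ids)   # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"X1": shuffled[:half], "X2": shuffled[half:]}


# Twenty hypothetical smokers, identified only by number.
groups = randomize_two_groups(range(1, 21), seed=42)
print(len(groups["X1"]), len(groups["X2"]))
```

Because every participant has the same chance of landing in either group, extraneous characteristics should, on average, be spread evenly across the two conditions.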

Randomized  Two-Group  Pretest-Posttest  Design 

Despite  the  relative  simplicity  of  the  posttest  only  approach,  most  ran¬ 
domized  experiments  typically  utilize  the  pretest-posttest  design,  which  is 
depicted  here: 
R— O— X1— O 
R— O— X2— O 

The addition of a pretest has several important benefits. First, it allows the
researcher to compare the groups on several measures following randomization
to determine whether the groups are truly equivalent. Although it is likely
that randomization distributed most differences equally across the groups, it
is possible that some differences still exist. This process of measuring the
integrity of random assignment is typically referred to as a randomization
check (see Rapid Reference 5.1). Researchers can often statistically control
for such preintervention differences if they are found.
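
To make the idea of comparing groups on pretest measures concrete, here is a minimal sketch that computes a standardized mean difference for each baseline variable and flags large ones. This is only one possible approach, chosen by us for illustration; the data, the variable names, and the 0.25 threshold are assumptions, and a real randomization check would normally rely on the statistical tests discussed in Chapter 7.

```python
from statistics import mean, stdev


def standardized_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


def randomization_check(baseline, threshold=0.25):
    """Return pretest variables whose standardized difference exceeds threshold.

    `threshold` is an arbitrary illustrative cutoff, not a rule from the text.
    """
    flagged = {}
    for variable, (treatment, control) in baseline.items():
        d = standardized_difference(treatment, control)
        if abs(d) > threshold:
            flagged[variable] = round(d, 2)
    return flagged


# Hypothetical pretest data: (treatment group values, control group values).
baseline = {
    "age": ([34, 41, 29, 38, 45], [36, 40, 31, 37, 44]),
    "cigarettes_per_day": ([20, 25, 30, 22, 28], [10, 15, 12, 18, 14]),
}
print(randomization_check(baseline))
```

In this invented data set the groups are similar in age but differ markedly in baseline smoking, so baseline smoking is the variable a researcher would want to control for in the final analyses.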

Rapid Reference 5.1

Randomization Checks

The randomization check, as its name suggests, is the process of examining
the overall effectiveness of random assignment. The goal of this process is
to determine whether random assignment resulted in nonequivalent
groups. In performing randomization checks, researchers compare study
groups or conditions on a number of pretest variables. These typically
include demographic variables such as age, gender, level of education, and
any other variables that are measured or available prior to the
intervention. Importantly, randomization checks should look for between-group
differences on the baseline measures of the dependent variables because
they are likely to have the most impact on outcomes. Generally,
randomization checks involve the use of statistical analyses that can examine
differences between groups (as will be discussed in Chapter 7). If differences
are found on certain variables, the researcher should determine whether
they are correlated with the outcomes. Any such variables that are
correlated with outcomes should be controlled for in the final analyses.

The second major benefit of a pretest is that it provides baseline
information that allows researchers to compare the participants who
completed the posttest to those who did not. Accordingly, researchers can
determine whether any between-group differences found at the end of the
study are due to the intervention or merely to differential attrition of
participants across groups. Attrition is the loss of participants during the
course of the study, and the comparison of completers with noncompleters
is typically referred to as an attrition analysis (see Rapid Reference 5.2).

For  example,  consider  a  study  in  which  we  compare  outpatient  treat¬ 
ment  to  inpatient  treatment  for  depression.  After  examining  the  posttest 
data,  we  conclude  that  outpatient  treatment  produced  greater  reductions 
in  depression  than  the  inpatient  treatment.  Although  random  assignment 
may  have  ensured  that  all  participant  differences  were  distributed  equally 
at  baseline,  it  did  not  ensure  that  all  groups  would  be  the  same  at  follow¬ 
up.  Therefore,  it  is  possible  that  certain  participants  were  more  likely  to 
drop  out  of  one  group  than  the  other,  resulting  in  differential  attrition.  In 
this  example,  clients  with  higher  levels  of  depression  may  have  been  more 
likely  to  drop  out  of  the  outpatient  treatment,  which  would  explain  the  rel¬ 
ative  success  of  outpatient  over  inpatient  treatment. 

Rapid Reference 5.2

Attrition and Attrition Analysis

Attrition analysis is a method of examining the overall impact of research
attrition on the makeup of a study sample and the validity of a study’s
findings. The goal of this procedure is to identify any differences between
those participants who complete the study and those who do not
complete the study. To conduct this type of analysis, researchers compare
completers versus noncompleters on a number of pretest variables. These
may include demographic and any other variables that are measured or
available on participants prior to the intervention. Generally, this process
involves the use of several statistical analyses.

Inevitably, a certain proportion of study participants will not make it to
follow-up. Often referred to as mortality, attrition can have many negative
effects on the validity of a research study. First, it may substantially
diminish the size of an experimental sample, which could reduce the study’s
statistical power and its ability to identify group differences if they exist.
Second, because participants who drop out are likely to be different from
those who complete, attrition may substantially limit the overall
generalizability of a study’s findings. Third, and perhaps most important, attrition
from research is generally not randomly distributed (Cook & Campbell,
1979) and appears to be systematically influenced by participant
characteristics, the nature of research interventions, the type of follow-up
methods employed, and many other variables. This can contribute to
highly systematic differences in attrition rates between research
conditions. Unfortunately, such differential attrition cannot be confidently
controlled for by random selection, random assignment, or any other
experimental research method (Cook & Campbell, 1963). As a practical matter,
when attrition occurs, it can never be definitively established whether
between-group differences in a particular study were caused by the
experimental intervention(s) or by differential attrition across conditions
(Campbell & Stanley, 1963; Cook & Campbell, 1963).
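
A simple descriptive version of an attrition analysis can be sketched as follows. The example mirrors the outpatient versus inpatient depression study described earlier, but all scores, field names, and the function itself are hypothetical; an actual analysis would add the statistical tests mentioned in Rapid Reference 5.2.

```python
from statistics import mean


def attrition_summary(participants):
    """Per group: dropout rate and pretest means of completers vs. dropouts."""
    summary = {}
    for group in sorted({p["group"] for p in participants}):
        members = [p for p in participants if p["group"] == group]
        completers = [p["pretest"] for p in members if p["completed"]]
        dropouts = [p["pretest"] for p in members if not p["completed"]]
        summary[group] = {
            "attrition_rate": len(dropouts) / len(members),
            "completer_pretest_mean": mean(completers) if completers else None,
            "dropout_pretest_mean": mean(dropouts) if dropouts else None,
        }
    return summary


# Hypothetical pretest depression scores (higher = more depressed).
participants = [
    {"group": "outpatient", "pretest": 30, "completed": False},
    {"group": "outpatient", "pretest": 28, "completed": False},
    {"group": "outpatient", "pretest": 15, "completed": True},
    {"group": "outpatient", "pretest": 17, "completed": True},
    {"group": "inpatient", "pretest": 29, "completed": True},
    {"group": "inpatient", "pretest": 16, "completed": True},
    {"group": "inpatient", "pretest": 27, "completed": True},
    {"group": "inpatient", "pretest": 14, "completed": False},
]
for group, stats in attrition_summary(participants).items():
    print(group, stats)
```

In this invented data set the outpatient group loses half of its participants, and those who leave are the most depressed at baseline, which is exactly the pattern of differential attrition that could make outpatient treatment look more effective than it is.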

One  obvious  disadvantage  of  the  pretest-posttest  design  is  that  the  use 
of  a  pretest  may  ultimately  make  participants  aware  of  the  purpose  of  the 
study  and  influence  their  posttest  results.  If  the  pretest  influences  the 
posttests  of  both  the  experimental  and  control  groups,  it  becomes  a  threat 
to  the  external  validity  or  generalizability  of  a  study’s  findings.  This  is  be¬ 
cause  the  posttest  will  no  longer  reflect  how  participants  would  respond  if 
they  had  not  received  a  pretest.  Alternatively,  if  the  pretest  influences  the 
posttests  of  only  one  of  the  groups,  it  poses  a  threat  to  the  internal  valid¬ 
ity  of  a  study.  We  discuss  internal  validity  in  detail  in  Chapter  6. 

Despite  this  drawback,  the  two-group  experimental  design  may  be  seen 
as  the  gold  standard  in  determining  whether  a  new  procedure  (or  inde¬ 
pendent  variable)  causes  an  effect.  Researchers  often  employ  this  design 
in  the  early  stages  of  an  intervention’s  empirical  validation.  At  these  initial 
stages,  the  researcher’s  primary  aim  may  simply  be  to  examine  the  effec¬ 
tiveness  of  the  intervention.  This  can  be  done  easily  and  relatively  inex¬ 
pensively  by  comparing  the  treatment  to  just  one  other  group  (typically  a 
standard  intervention  or  a  placebo  control).  If  the  study’s  findings  suggest 
that  the  treatment  is  effective,  the  researcher  may  want  to  test  more- 
specific  hypotheses  regarding  the  treatment,  such  as  isolating  its  effective 
components  by  dismantling  the  intervention  (see  Rapid  Reference  5.3), 
examining  its  effectiveness  with  other  populations,  comparing  it  with 
other types of treatment, or examining it in combination with other
interventions. Testing these hypotheses may require the use of other, perhaps
more sophisticated experimental designs.

Rapid Reference 5.3

Dismantling Studies

The term dismantling, as used in the research context, refers to studies
aimed at isolating the effective components of an intervention. In studying
specific interventions, researchers often begin by examining the
effectiveness of the overall model. However, once the model is found to be
effective, the research community will often want to know why it is effective.
To answer this question, researchers may begin dismantling the
intervention. Dismantling can be done in a variety of ways, but typically involves a
series of studies that compare an intervention with and without certain
components.


Solomon  Four-Group  Design 

It  is  perhaps  easiest  to  understand  the  Solomon  four-group  design  if  we 
think  of  it  as  a  combination  of  the  randomized  posttest  only  and  pretest- 
posttest  two-group  designs,  as  depicted  below. 

R— O— X1— O

R— O— X2— O

R———— X1— O

R———— X2— O

The principal advantage of this design is that it controls for the potential
effects of the pretest on posttest outcomes. This design allows the researcher
to determine whether posttest differences resulted from the intervention,
the pretest, or a combination of the treatment and the pretest. This last
possibility is an example of an interaction, which will be discussed shortly.
Importantly, this design offers the best features of both of the two-group
designs, in that it allows the researcher to examine between-group differences
at baseline, without the results’ being influenced or confounded by the
pretest administration. For this reason, the Solomon four-group design can
also be viewed as a very basic example of a factorial design (discussed in the
next section), as it examines the separate and combined effects of more
than one independent variable (i.e., the pretest and the intervention).
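
The logic of separating the intervention effect, the pretest effect, and their combination can be sketched with simple arithmetic on the four posttest means. The group labels, the numbers, and the function below are hypothetical and are meant only as an illustration; they do not replace a proper statistical analysis.

```python
def solomon_effects(posttest_means):
    """Decompose four posttest means from a Solomon four-group design.

    Expects a dict with the (hypothetical) keys used below.
    """
    pt = posttest_means["pretest_treatment"]
    pc = posttest_means["pretest_control"]
    nt = posttest_means["no_pretest_treatment"]
    nc = posttest_means["no_pretest_control"]

    effect_with_pretest = pt - pc       # treatment effect among pretested groups
    effect_without_pretest = nt - nc    # treatment effect among unpretested groups
    return {
        "treatment_effect": ((pt + nt) - (pc + nc)) / 2,
        "pretest_effect": ((pt + pc) - (nt + nc)) / 2,
        # A nonzero value suggests the pretest changed how the treatment worked.
        "pretest_by_treatment_interaction": effect_with_pretest - effect_without_pretest,
    }


means = {
    "pretest_treatment": 30,
    "pretest_control": 20,
    "no_pretest_treatment": 26,
    "no_pretest_control": 20,
}
print(solomon_effects(means))
```

In these invented numbers the treatment appears to work in both halves of the design, but its apparent effect is larger when a pretest was given, which is the kind of pretest-by-treatment interaction the Solomon design is built to reveal.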


Factorial  Design 

Most outcomes in research are likely to have several causes that interact
with each other in a variety of ways that cannot be identified through the
use of two-group experimental designs. For example, as discussed, the
two-group pretest-posttest design might result in an undetectable
interaction effect (see Rapid Reference 5.4 and Figure 5.1) between the pretest
and the independent variable, such that posttest differences, if found,
could not be confidently attributed to the independent variable. The
Solomon four-group design, which may also be viewed as a factorial
design, was able to control for this potential interaction.

Rapid Reference 5.4

Interaction Effects

An interaction effect is the result of two or more independent variables
combining to produce a result different from those produced by either
independent variable alone. An interaction effect occurs when the effect of
one independent variable differs across the levels of at least one other
independent variable. Interactions can be found only in those factorial designs
that include two or more independent variables. When reviewing the results
of a factorial study, we begin by determining whether there are any
significant interactions. If significant interactions are found, we can no longer
interpret the simple effects (i.e., between-group differences for either
independent variable alone), because they (as a result of the interaction) are
determined to vary across levels of the other independent variable(s). This
is illustrated in Figure 5.1, where the dose of a specific intervention is
found to interact with client gender on the client success rate.

In this example, we cannot interpret the simple effects of gender or dose
(on client success rate) because they vary as a function of each other. We
can interpret only the interaction, which appears to indicate that males
are more successful with lower doses, while females are more successful
with higher doses.

Figure 5.1 An example of an interaction effect (x-axis: client gender, male
vs. female; bars: high dose vs. low dose).

The primary advantage of factorial designs is that they enable us to
empirically examine
the  effects  of  more  than  one  independent  variable,  both  individually  and 
in  combination,  on  the  dependent  variable,  as  depicted  in  the  following 
illustration.  The  design,  as  its  name  implies,  allows  us  to  examine  all  pos¬ 
sible  combinations  of  factors  in  the  study: 

R— X1— Y1— O

R— X1— Y2— O

R— X2— Y1— O

R— X2— Y2— O

To  further  illustrate  the  utility  of  this  design,  let  us  consider  a  situation 
in  which  a  researcher  is  interested  in  examining  how  both  treatment  dose 
(4  vs.  8  sessions)  and  treatment  setting  (client’s  home  vs.  clinical  setting) 
influence  the  effectiveness  of  a  particular  intervention.  Although  the  re¬ 
searcher  could  conduct  separate  two-group  randomized  studies,  this 
would  not  provide  information  on  the  potential  interaction  of  different 
doses  of  treatment  with  different  treatment  settings.  The  researcher 
might,  for  example,  want  to  test  the  hypothesis  that  higher  doses  of  treat¬ 
ment  provided  in  a  clinical  setting  will  result  in  the  best  treatment  out¬ 
comes.  To  best  examine  this  hypothesis,  the  researcher  could  make  use  of 
a factorial design. This specific example would be considered a two-by-two
(2 × 2) factorial design, because each of the two independent variables
has two levels, as illustrated here:

                          Dose
Setting       Low (4 weeks)       High (8 weeks)
Home
Clinical
Following this same notation, a study with two independent variables in
which one independent variable had three levels and the other had two
levels would be considered a two-by-three (2 × 3) factorial design. Similarly, a
study with three two-level independent variables would be considered a
two-by-two-by-two (2 × 2 × 2) factorial design. Although a study could have
any number of independent variables with any number of levels, it is
important to note that each additional independent variable that is added
to the factorial design multiplies the number of groups by its number of levels.
Where a 2 × 2 design has four groups, a 2 × 2 × 3 design will have 12 groups.
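
The way the number of groups grows can be checked with a few lines of Python. This sketch simply enumerates every combination of factor levels; the factor names and level labels are illustrative assumptions, including the hypothetical third factor added in the second example.

```python
from itertools import product
from math import prod


def factorial_cells(factors):
    """List every combination of levels (one dict per group in the design)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def number_of_groups(factors):
    """The number of groups is the product of the numbers of levels."""
    return prod(len(levels) for levels in factors.values())


# The 2 x 2 dose-by-setting example from the text.
design = {"dose": ["low", "high"], "setting": ["home", "clinical"]}
cells = factorial_cells(design)
for cell in cells:
    print(cell)
print(number_of_groups(design))

# Adding a hypothetical three-level factor gives a 2 x 2 x 3 design.
larger = dict(design, therapist=["A", "B", "C"])
print(number_of_groups(larger))
```

Each added factor multiplies the number of cells, which is why the number of participants required can grow quickly as a factorial design becomes more elaborate.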

The  factorial  design  has  several  important  strengths.  First,  it  permits 
the  simultaneous  examination  of  more  than  one  independent  variable. 
This  can  be  critical  because  most,  if  not  all,  human  behavior  is  determined 
by  more  than  one  variable.  A  second  and  related  strength  is  the  efficiency 
of  the  factorial  design.  Because  it  allows  us  to  test  several  hypotheses  in  a 
single  research  study,  it  can  be  more  economical  to  use  a  factorial  design 
than  to  conduct  several  individual  studies,  in  terms  of  both  number  of  par¬ 
ticipants  and  researcher  effort.  Last,  and  perhaps  most  important,  the  fac¬ 
torial  design  allows  us  to  look  for  interactions  between  independent  vari¬ 
ables.  Just  as  most  human  behavior  is  influenced  by  more  than  one 
variable,  it  is  equally  probable  that  no  combination  of  variables  influences 
all  persons  in  the  same  manner  or  influences  human  behavior  the  same 
way  in  all  possible  conditions.  In  other  words,  there  are  no  universal  truths. 
It  is  therefore  critical  to  examine  between-variable  interactions  to  more 
accurately  describe  causal  relationships  (Fisher,  1953;  Ray  &  Ravizza, 
1988). 

Are  Experimental  Designs  Perfect? 

Despite  their  seemingly  ideal  nature,  even  studies  that  employ  experimen¬ 
tal  designs  may  face  threats  to  validity  in  certain  situations  (Cook  &  Camp¬ 
bell,  1979).  Threats  to  validity  will  be  discussed  in  detail  in  Chapter  6,  so 
we  will  not  spend  too  much  time  discussing  them  in  this  chapter.  We  will, 
however,  introduce  you  to  some  of  the  more  common  threats  to  validity. 
The  first  such  threat  occurs  when  a  study’s  control  group  is  inadvertently 
exposed  to  the  intervention  or  when  key  aspects  of  the  intervention  also 
exist  in  the  control  group.  This  can  substantially  diminish  the  unique  as¬ 
pects  of  an  experimental  intervention  and  reduce  any  potential  between- 
group  differences. 

Another  situation  that  may  threaten  a  study’s  validity  (even  with  ran¬ 
domized  experimental  designs)  occurs  when  one  of  the  groups  is  per¬ 
ceived  by  participants  as  better  or  more  desirable  than  the  other.  If  partic¬ 
ipants  in  one  condition  feel  that  those  in  the  other  condition  are  somehow 
receiving  superior  treatment,  they  may  experience  feelings  of  resentment 
toward  the  researcher,  may  feel  demoralized,  or  may  even  try  harder  or 
change  their  behavior  to  compensate.  When  condition  assignment  affects 
participant  behavior  in  this  manner,  a  contrast  effect  has  occurred.  Contrast 
effects  can  have  a  substantial  impact  on  a  study’s  findings. 

Still  another  potential  threat  to  the  validity  of  an  experimental  design 
occurs  when  there  are  substantial  differences  in  the  implementation  of 
the  experimental  and  control  conditions.  For  example,  this  may  occur  if 
the  clinician  delivering  the  experimental  treatment  were  far  more  experi¬ 
enced  or  educated  than  the  one  delivering  the  control  treatment.  This 
could obviously confound the study’s findings by diminishing the
researcher’s ability to attribute any measured change to the experimental
intervention.

Finally, and very importantly, experimental designs are also not immune
to  the  effects  of  differential  participant  mortality  (or  dropout).  This  is  par¬ 
ticularly  likely  when  one  of  the  conditions  is  noxious  or  onerous.  Regard¬ 
less  of  randomization,  participant  dropout  can  substantially  reduce  a 
study’s  internal  validity  by  systematically  creating  two  or  more  very  differ¬ 
ent  groups  and  ultimately  undoing  what  randomization  initially  achieved. 

Another  important  point  about  randomized  experimental  designs  is 
that  randomization,  while  far  superior  to  other  methods  in  ensuring  that 
extraneous  variables  are  distributed  equally  across  groups,  does  not  always 
work.  This  is  of  particular  concern  when  sample  sizes  are  small  (i.e.,  fewer 
than  40  participants  per  group).  Although  researchers  may  attempt  to  ex¬ 
amine  the  integrity  of  randomization  by  comparing  the  study  groups  on  a 
number  of  pretest  measures,  they  can  never  be  certain  that  differences  do 
not  exist.  Ironically,  because  they  lack  sufficient  statistical  power  (i.e.,  the 
ability  to  detect  between-group  differences  if  differences  actually  exist), 
studies  with  small  sample  sizes  are  less  likely  to  find  between-group  dif¬ 
ferences  on  such  measures  (Kazdin,  2003c). 
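
The point that randomization is less dependable in small samples can be illustrated with a small simulation. The sketch below repeatedly assigns simulated participants to two groups at random and records how far apart the groups end up on a baseline score. The score distribution (mean 50, standard deviation 10), the number of replications, and the function name are all assumptions made for illustration.

```python
import random


def average_baseline_imbalance(n_per_group, replications=500, seed=0):
    """Average absolute difference in baseline means after random assignment."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(replications):
        # Simulated baseline scores for everyone in the study.
        scores = [rng.gauss(50, 10) for _ in range(2 * n_per_group)]
        rng.shuffle(scores)  # random assignment: first half vs. second half
        group_a = scores[:n_per_group]
        group_b = scores[n_per_group:]
        gap = abs(sum(group_a) / n_per_group - sum(group_b) / n_per_group)
        gaps.append(gap)
    return sum(gaps) / len(gaps)


small = average_baseline_imbalance(10)
large = average_baseline_imbalance(100)
print(round(small, 2), round(large, 2))
```

Under these assumptions, groups of 10 tend to differ at baseline by noticeably more than groups of 100, even though both were formed by the same random procedure.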

The  most  obvious  limitation  of  studies  that  employ  a  randomized  ex¬ 
perimental  design  is  their  logistical  difficulty.  Randomly  assigning  partici¬ 
pants  in  certain  settings  (e.g.,  criminal  justice,  education)  may  often  be 
unrealistic,  either  for  logistical  reasons  or  simply  because  it  may  be  con¬ 
sidered  inappropriate  in  a  particular  setting.  Although  efforts  have  been 
made  to  extend  randomized  designs  to  more  real-world  settings,  it  is  often 
not feasible. In such cases, the researcher often turns to quasi-experimental
designs.


QUASI-EXPERIMENTAL  DESIGNS 

As just noted, although random assignment is the best way to ensure the
internal validity of a research study, it is often not feasible in real-world
environments. Therefore, when randomized designs are not feasible,
researchers must often make use of quasi-experimental designs. A good rule
of thumb is that researchers should attempt to use the most rigorous
research design possible, striving to use a randomized experimental design
whenever possible (Campbell, 1969).

Cook  and  Campbell  (1979)  present  a  variety  of  quasi-experimental 
designs,  which  can  be  divided  into  two  main  categories:  nonequivalent 
comparison-group  designs  and  interrupted  time-series  designs.  In  this 
section,  we  will  discuss  these  two  major  groups  of  quasi-experimental  de¬ 
signs,  followed  by  a  brief  overview  of  single-subjects  designs. 

Nonequivalent  Comparison-Group  Designs 

Nonequivalent  comparison-group  designs  are  among  the  most  com¬ 
monly  used  quasi-experimental  designs.  Structurally,  these  designs  are 
quite  similar  to  the  experimental  designs,  but  an  important  distinction  is 
that  they  do  not  employ  random  assignment.  In  using  these  designs,  the 
researcher  attempts  to  select  groups  that  are  as  similar  as  possible.  Unfor¬ 
tunately,  as  indicated  by  the  design’s  name,  it  is  likely  that  the  resulting 
groups will be nonequivalent. With careful analysis and cautious
interpretation, however, nonequivalent comparison-group designs may still lead
to some valid conclusions (Graziano & Raulin, 2004).

Nonequivalent  Groups  Posttest-Only  (Two  or  More  Groups) 

In  the  nonequivalent  groups  posttest-only  design,  one  group  (the  experi¬ 
mental  group)  receives  the  intervention  while  the  other  group  (the  control 
group)  does  not,  as  depicted  here  (NR  =  not  randomized): 

NR— X1— O

NR— X2— O

Unfortunately,  there  is  a  low  probability  that  any  resulting  between-group 
differences  on  the  dependent  variable  could  be  attributed  to  the  interven¬ 
tion,  so  the  results  of  a  study  using  this  design  may  be  considered  largely 
uninterpretable. 

One  potential  application  of  this  design  (Cook  &  Campbell,  1979; 
McGuigan,  1983)  is  a  case  in  which  each  of  the  groups  might  represent  a 
different  type  of  teaching  method.  If  differences  are  found  in  the  resulting 
test  scores  of  students,  it  may  suggest  that  the  specific  teaching  method 
caused  the  differences.  However,  it  is  equally  possible  that  students  who 
were  likely  to  achieve  higher  grades  were  selected  for  a  specific  teaching 
method.  Ultimately,  even  this  variation  cannot  rule  out  the  serious  threats 
to  internal  validity  that  plague  this  design. 

Nonequivalent  Groups  Pretest-Posttest  (Two  or  More  Groups) 

In  the  nonequivalent  groups  pretest-posttest  design,  the  dependent  vari¬ 
able  is  measured  both  before  and  after  the  treatment  or  intervention,  as 
depicted  here: 

NR— O— X1— O 
NR— O— X2— O 

This  gives  it  two  advantages  over  its  posttest  only  counterpart.  First,  with 
the  use  of  both  a  pretest  and  a  posttest,  the  temporal  precedence  of  the  in¬ 
dependent  variable  to  the  dependent  variable  can  be  established.  This  may 
give  the  researcher  more  confidence  when  inferring  that  the  independent 
variable  was  responsible  for  changes  in  the  dependent  variable.  Second, 
the  use  of  a  pretest  allows  the  researcher  to  measure  between-group  dif¬ 
ferences  before  exposure  to  the  intervention.  This  could  substantially  re¬ 
duce  the  threat  of  selection  bias  by  revealing  whether  the  groups  differed 
on  the  dependent  variable  prior  to  the  intervention. 
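
One way to see what the pretest adds is to compare the groups at baseline and then compare how much each group changed. The sketch below does this with gain scores. This is an illustrative technique we have chosen, not a method prescribed in this chapter, and the test scores are invented. Note that comparing gains does not by itself remove selection bias.

```python
from statistics import mean


def gain_score_comparison(treatment, control):
    """Compare two nonequivalent groups given (pretest, posttest) pairs."""
    pre_t = [pre for pre, _ in treatment]
    pre_c = [pre for pre, _ in control]
    gain_t = mean([post - pre for pre, post in treatment])
    gain_c = mean([post - pre for pre, post in control])
    return {
        "baseline_gap": mean(pre_t) - mean(pre_c),  # difference before intervention
        "treatment_gain": gain_t,
        "control_gain": gain_c,
        "difference_in_gains": gain_t - gain_c,
    }


# Hypothetical test scores for two intact classrooms.
treatment = [(60, 75), (55, 70), (65, 80)]
control = [(50, 55), (45, 50), (55, 60)]
print(gain_score_comparison(treatment, control))
```

Here the treatment classroom already scored higher before the intervention, a difference that a posttest-only comparison would have mistaken for part of the treatment effect.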

Interrupted  Time-Series  Designs 

The  time-series  design  is  perhaps  best  described  as  an  extension  of  a  one- 
group  pretest-posttest  design — the  design  is  extended  by  the  use  of  nu¬ 
merous  pretests  and  posttests.  In  this  type  of  quasi-experimental  design, 
periodic  measurements  are  made  on  a  group  prior  to  the  presentation  (in¬ 
terruption)  of  the  intervention  to  establish  a  stable  baseline.  Observing 
and  establishing  the  normal  fluctuation  of  the  dependent  variable  over 
time  allows  the  researcher  to  more  accurately  interpret  the  impact  of  the 
independent  variable.  Following  the  intervention,  several  more  periodic 
measurements  are  made.  There  are  four  basic  variations  of  this  design: 
the  simple  interrupted  time-series  design,  the  reversal  time-series  design, 
the  multiple  time-series  design,  and  the  longitudinal  design. 

Simple  Interrupted  Time-Series  Design 

The simple interrupted time-series design is a within-subjects design in which
periodic measurements are made on a single group in an effort to establish a
baseline, as depicted here:

O— O— O— O— X— O— O— O— O 

At  some  point  in  time,  the  independent  variable  is  introduced,  and  it  is 
followed  by  additional  periodic  measurements  to  determine  whether  a 
change  in  the  dependent  variable  occurs. 

According  to  Cook  and  Campbell  (1979),  there  are  two  principal  ways 
in  which  the  independent  variable  can  influence  the  series  of  observations 
after  it  has  been  introduced:  (1)  a  change  in  the  level  and  (2)  a  change  in 
the  slope.  A  sharp  discontinuity  in  the  values  of  the  dependent  variable  at 
the  point  of  interruption  (introduction  of  the  independent  variable)  would 
indicate  a  change  in  level. 

To  better  understand  this,  consider  a  study  in  which  an  employer  was 
using  a  particular  rating  system  to  evaluate  the  employees’  monthly  pro¬ 
ductivity,  before  and  after  offering  them  stock  options.  One  potential  out¬ 
come  might  be  a  dramatic  change  in  employee  productivity.  As  depicted 
in Figure 5.2, employee productivity ratings that hovered between 2 and 3
prior to the availability of stock options might abruptly rise to the 5 to 6
range following the company offer. Alternatively, as depicted in Figure 5.3,
the employer might find a steady increase in productivity following the
company bonus.

Figure 5.2 An example of a change in level.

Figure 5.3 An example of a change in slope.
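
The difference between a change in level and a change in slope can be sketched numerically. The example below summarizes the observations before and after the interruption by their average (level) and by a least-squares trend line (slope). The productivity ratings are invented, and this descriptive summary is only an illustration under those assumptions, not a full time-series analysis.

```python
def least_squares_slope(values):
    """Slope of the best-fitting straight line through equally spaced points."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def describe_interruption(before, after):
    """Summarize level and slope on each side of the intervention."""
    return {
        "level_change": sum(after) / len(after) - sum(before) / len(before),
        "slope_before": least_squares_slope(before),
        "slope_after": least_squares_slope(after),
    }


baseline = [2, 3, 2, 3]                                # hypothetical monthly ratings
print(describe_interruption(baseline, [5, 6, 5, 6]))   # abrupt jump: change in level
print(describe_interruption(baseline, [3, 4, 5, 6]))   # steady climb: change in slope
```

In the first comparison the ratings jump but keep the same flat trend; in the second the trend itself becomes steeper after the intervention.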

In  addition  to  the  level  and  slope,  the  researcher  can  examine  the  dura¬ 
tion  of  effects  and  whether  they  ultimately  persist  or  decay  over  time.  Fi¬ 
nally,  the  researcher  can  examine  the  ultimate  latency  of  effects  and 
whether  the  effect  was  immediate  or  delayed.  The  more  immediate  the 
change  in  the  dependent  variable,  the  more  likely  that  the  change  is  due  to 
the  influence  of  the  independent  variable.  The  ability  to  examine  changes 
and  trends  across  a  series  of  observations  made  before  and  after  the  inter¬ 
vention  permits  the  researcher  to  more  closely  identify  the  possibility  of 
maturation,  testing,  and  history  as  alternative  explanations.  (Maturation, 
testing,  and  history  are  discussed  further  in  Chapter  6.) 

Although  changes  in  either  level  or  slope  are  often  used  as  the  basis 
for  inferring  a  causal  relationship  between  the  independent  and  depen¬ 
dent  variables,  such  inferences  must  be  made  with  extreme  caution  be¬ 
cause  this  design  does  little  to  control  for  alternative  explanations  for 
measured  change.  For  instance,  in  the  prior  example,  it  may  have  been 
the  employer’s  attention  rather  than  the  bonus  that  led  to  increased 
employee  productivity.  Consequently,  this  design  does  not  permit  a 
researcher  to  draw  causal  inferences  with  any  substantial  degree  of 
certainty. 
Reversal  Time-Series  Design 

Also  known  as  an  ABA  design  (detailed  on  page  145),  the  reversal  time-series 
design  is  basically  a  multi-subject  variation  of  the  single-subject  reversal  de¬ 
sign,  which  will  be  discussed  later  in  this  chapter.  The  basic  goal  of  this 
design  is  to  establish  causality  by  presenting  and  withdrawing  an  interven¬ 
tion,  or  independent  variable,  one  to  several  times  while  concurrently 
measuring  change  in  the  dependent  variable  (as  depicted  in  the  following). 
As  in  the  simple  time-series  design,  this  design  begins  with  a  series  of 
pretests  to  observe  normal  fluctuations  in  baseline.  The  name  “reversal” 
refers  to  the  idea  that  causality  can  be  inferred  if  changes  that  occur  fol¬ 
lowing  the  presentation  of  an  intervention  diminish  or  “reverse”  when  the 
independent  variable  is  withdrawn. 

O - O - O - X - O - O - O - REV - O - O - O - X - O - O - O
   (A)            (B)                (A)

To  fully  appreciate  the  elegance  of  this  design,  consider  the  prior  ex¬ 
ample  in  which  an  employer  offers  a  company  bonus.  Imagine  if,  rather 
than  offering  a  one-time  bonus,  the  employer  offered  a  monthly  bonus  to 
employees  for  2  months,  removed  it  for  2  months,  and  then  again  offered 
it  for  2  months.  If  increases  in  productivity  were  found  following  each 
bonus,  and  decreases  in  productivity  were  found  each  time  the  bonus  was 
removed,  one  could  be  fairly  confident  that  company  bonuses  influenced 
employee  productivity. 
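
The logic of the bonus example can be sketched in a few lines of Python. The sketch below checks whether the mean of the dependent variable rises each time the intervention is presented (a B phase) and falls each time it is withdrawn (an A phase). The function name, the data, and the assumption that the intervention is expected to raise scores are all hypothetical.

```python
from statistics import mean

def shows_reversal_pattern(phases):
    """phases: list of (label, observations) in time order, labels 'A' or 'B'.

    Returns True when every move into a B phase raises the phase mean and
    every move back into an A phase lowers it.
    """
    means = [(label, mean(obs)) for label, obs in phases]
    for (_, prev_mean), (label, cur_mean) in zip(means, means[1:]):
        if label == "B" and cur_mean <= prev_mean:
            return False
        if label == "A" and cur_mean >= prev_mean:
            return False
    return True

# Hypothetical monthly productivity: baseline, bonus offered for 2 months,
# removed for 2 months, then offered again for 2 months.
bonus_study = [
    ("A", [50, 51, 49]),   # baseline pretests
    ("B", [58, 60]),       # bonus offered
    ("A", [52, 50]),       # bonus removed
    ("B", [59, 61]),       # bonus offered again
]

# Hypothetical skill scores that keep rising regardless of phase.
learning_curve = [
    ("A", [10, 10]),
    ("B", [20, 22]),
    ("A", [24, 25]),
    ("B", [27, 28]),
]

print(shows_reversal_pattern(bonus_study))
print(shows_reversal_pattern(learning_curve))
```

The second data set illustrates a variable that does not reverse when the intervention is withdrawn, so the check fails for it.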

Despite its elegance, the reversal design, like its single-subject
counterpart (to be discussed), is not appropriate for the study of all
independent or dependent variables. The effects of some interventions
simply cannot be reversed, as with learning to read or learning to ride a
bike. You can offer and remove instruction on these skills as often as you
like and you are still likely to observe a learning curve, with little
reversal. It is therefore necessary for the researcher to carefully
consider the characteristics of the independent variable to be studied
when considering the use of this design.

Multiple Time-Series Design

This  design  is  essentially  the  same  as  the  nonequivalent  pretest-posttest 
design,  with  the  exception  that  the  dependent  variable  is  measured  at  mul¬ 
tiple  time  points  both  before  and  after  presentation  of  the  independent 
variable,  or  longitudinally  (see  Rapid  Reference  5.5),  as  depicted  here: 

O - O - O - O - X1 - O - O - O - O
O - O - O - O - X2 - O - O - O - O

Although this design is not randomized, it can be quite strong in terms of
its ability to rule out other explanations for the observed effect. Like
the single-group time-series design, it enables us to examine trends in the
data at multiple time points before, during, and after an intervention,
which allows us to evaluate the plausibility of certain threats to internal
validity. Its major strength over the single-group time-series design is
that it permits both within-group and between-group comparisons, which may
further reduce concerns about alternative explanations associated with
history. Because this design does not involve random assignment, however,
it cannot eliminate all threats to internal validity.
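
As an illustration of the two kinds of comparison, the Python sketch below computes the pre-to-post change within each group and then the difference between those changes. The function names and scores are hypothetical. The idea is that an outside event (history) that affects both groups should appear in both within-group changes and therefore largely cancel in the between-group difference.

```python
from statistics import mean

def within_group_change(pre, post):
    """Mean of the post-intervention observations minus mean of the pretests."""
    return mean(post) - mean(pre)

def between_group_difference(pre_1, post_1, pre_2, post_2):
    """Difference between the two groups' pre-to-post changes."""
    return within_group_change(pre_1, post_1) - within_group_change(pre_2, post_2)

# Hypothetical scores for two nonequivalent groups, four observations
# before and four after each group's condition (X1 or X2) is introduced.
group_1_pre, group_1_post = [50, 51, 49, 50], [58, 59, 60, 59]
group_2_pre, group_2_post = [48, 49, 50, 49], [51, 52, 51, 50]

change_1 = within_group_change(group_1_pre, group_1_post)
change_2 = within_group_change(group_2_pre, group_2_post)
difference = between_group_difference(group_1_pre, group_1_post,
                                      group_2_pre, group_2_post)
print(change_1, change_2, difference)
```

Because the groups are not randomly assigned, a nonzero difference is suggestive rather than conclusive.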


Rapid Reference 5.5


Longitudinal  Designs 

Longitudinal designs involve taking multiple measurements of each study
participant over time. Generally, the purpose of longitudinal studies is to
follow a case or group of cases over a period of time to gather normative
data on growth, to plot trends, or to observe the effects of special factors.
For example, a researcher may want to study the development of more
than one birth cohort (i.e., a group of individuals born in the same
calendar year or group of years) to determine whether personality features
are stable over time.

Single-Subject Experimental Designs

Not to be confused with nonexperimental single-subject case studies,
which are covered later in this chapter, the single-subject experimental
design has a long and respected tradition in empirical research. According
to Kazdin (2003c), single-subject experiments might be seen as true
experiments because they “can demonstrate causal relationships and can rule
out or make implausible threats to validity with the same elegance of
group research” (p. 273). Similar to other experimental designs, the
single-subject design seeks to (1) establish that changes in the dependent
variable occur following introduction of the independent variable (temporal
precedence) and (2) identify differences between study conditions.

The one way in which single-subject designs differ from other experimental
designs is in how they establish control, and thereby demonstrate that
changes in a dependent variable are not due to extraneous variables. For
example, group experimental designs rely on randomization to distribute
extraneous variables equally across conditions and on statistical
techniques to control for such factors if they are found. Single-subject
designs, by contrast, eliminate between-subject variables by using only one
participant, and they control for relevant environmental factors by
establishing a stable baseline of the dependent variable. If change occurs
following the introduction of the intervention, or independent variable,
the researcher can reasonably assume that the change was due to the
intervention and not to extraneous factors.

As  with  time-series  designs,  single-subject  designs  typically  begin  by  es¬ 
tablishing  a  stable  baseline.  Establishing  a  stable  baseline  involves  taking  re¬ 
peated  measures  of  a  participant’s  behavior  (dependent  variable)  prior  to 
the  administration  of  any  intervention  to  make  certain  that  the  partici¬ 
pant’s  behavior  is  occurring  at  a  consistent  rate.  To  obtain  a  stable  base¬ 
line,  the  researcher  must  make  special  efforts  to  control  all  relevant  envi¬ 
ronmental  variables  that  otherwise  might  affect  the  participant’s 
responses.  If  the  researcher  does  not  know,  or  is  uncertain,  about  which 
variables  are  relevant,  the  researcher  must  attempt  to  keep  the  partici¬ 
pant’s  environment  as  constant  as  possible  by  maintaining  highly  con¬ 
trolled  conditions. 
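
The text does not give a numerical rule for deciding when a baseline is stable, so the following Python sketch adopts a purely hypothetical one: a baseline counts as stable when it has at least three observations and each falls within 15 percent of the baseline mean. The function name, the cutoff, and the data are assumptions made for illustration, and a researcher would choose a criterion suited to the behavior being measured.

```python
from statistics import mean

def is_stable_baseline(observations, tolerance=0.15, min_points=3):
    """Hypothetical stability rule (not from the text): at least min_points
    observations, each within tolerance (a proportion) of the baseline mean."""
    if len(observations) < min_points:
        return False
    center = mean(observations)
    if center == 0:
        # With a zero mean, treat the baseline as stable only if every value is zero.
        return all(x == 0 for x in observations)
    return all(abs(x - center) <= tolerance * abs(center) for x in observations)

# Hypothetical counts of a target behavior per session.
steady = [10, 11, 10, 9, 10]
erratic = [4, 9, 15, 7]

print(is_stable_baseline(steady))
print(is_stable_baseline(erratic))
```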

Single-Subject Reversal Design

The reversal design (also known, like the reversal time-series design, as the
ABA design) is one of the most widely used single-subject designs. As in the
reversal time-series design, the single-subject reversal design measures
behavior during three phases: before the intervention is introduced (A),
after introducing the intervention (B), and again after withdrawing the
intervention (A). The primary goal of this design is, first, to determine
whether there is a change in the dependent variable following the
introduction of the independent variable and, second, to determine whether
the dependent variable reverses or returns to baseline once the independent
variable is withdrawn. To rule out the possibility that apparent effects
might be due to a certain cyclical pattern involving either maturation or
practice (to be discussed in Chapter 6), the ABA design may be extended to
an ABAB design. To rule out even more complicated maturation or practice
effects, the researcher could extend the design even further to an ABABA
design. The more measurements that are made, the less likely it is that
measured change is due to anything other than the intervention, or
independent variable.

The single-subject reversal design has the same limitations as its
time-series counterpart. First, and most obviously, not all behaviors are
reversible. Certain learned behaviors, such as reading, riding a bike, or
speaking a language, are somewhat permanent. Second, withdrawal of certain
useful interventions or curative treatments may be unethical. To address
this issue, many studies opt for the ABAB variant, in which the
intervention is repeated and is designated as the final condition.

Single-Subject  Multiple-Baseline  Design 

A  second,  very  common  single-subject  approach  is  the  multiple-baseline  de¬ 
sign.  This  design  demonstrates  the  effectiveness  of  a  treatment  by  showing 
that  behaviors  across  more  than  one  baseline  change  as  a  consequence  of 
the  introduction  of  a  treatment.  In  this  design,  several  behaviors  of  a  single 
subject  are  monitored  simultaneously.  Once  stable  baselines  are  estab¬ 
lished  for  all  of  the  behaviors,  one  of  the  behaviors  is  exposed  to  the  in¬ 
tervention.  The  primary  goal  of  this  design  is  to  determine  whether  the 
behavior that is exposed to the intervention changes while the other
behaviors remain constant. Once the first behavioral shift is identified,
the intervention is applied to the next behavior, and so on. The logic
behind this design is that it would be highly unlikely for baseline
behaviors to successively shift by chance.

For  example,  suppose  a  tutor  wants  to  test  whether  providing  small 
prizes  or  rewards  can  change  two  distinct  behaviors  that  one  of  her  stu¬ 
dents  is  displaying  (i.e.,  asking  questions,  and  attending  tutoring  sessions 
on  time).  The  tutor,  after  establishing  a  stable  baseline  for  both  behaviors, 
observes  that  the  student  asks  an  average  of  3  questions  per  week,  and  at¬ 
tends  tutoring  sessions  on  time  an  average  of  2  times  per  week.  The  tutor 
might  begin  by  giving  the  student  prizes  for  asking  questions  regardless  of 
her  tardiness  for  the  first  two  weeks.  At  this  point,  the  tutor  may  find  that 
the  student  begins  to  ask  an  average  of  5  questions  per  week,  while  her  tar¬ 
diness  remains  the  same.  After  two  weeks,  the  tutor  might  also  begin  giv¬ 
ing  the  student  prizes  for  attending  her  tutoring  sessions  on  time.  In  other 
words,  the  tutor  might  begin  rewarding  both  behaviors.  After  another  two 
weeks,  the  tutor  might  observe  that  the  student’s  average  rate  of  question¬ 
asking  remains  at  5  times  per  week,  but  that  her  average  on-time  atten¬ 
dance  increases  to  4  times  per  week. 
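
The tutoring example can be laid out as data to show the staggered pattern the design looks for. In the Python sketch below, the weekly counts are hypothetical values chosen to match the averages in the example (two baseline weeks are assumed), and the function and variable names are illustrative only.

```python
from statistics import mean

def phase_means(series, start):
    """Mean of a behavior before and after its own intervention start index."""
    return mean(series[:start]), mean(series[start:])

# Hypothetical weekly counts: weeks 0-1 baseline, weeks 2-3 prizes for
# questions only, weeks 4-5 prizes for both behaviors.
behaviors = {
    "questions asked": {"series": [3, 3, 5, 5, 5, 5], "start": 2},
    "on-time arrivals": {"series": [2, 2, 2, 2, 4, 4], "start": 4},
}

for name, b in behaviors.items():
    before, after = phase_means(b["series"], b["start"])
    print(f"{name}: mean {before:.1f} before its intervention, {after:.1f} after")

# On-time arrivals during the weeks when only question-asking was rewarded.
untreated_weeks = mean(behaviors["on-time arrivals"]["series"][2:4])
print(f"on-time arrivals while untreated: {untreated_weeks:.1f}")
```

Each behavior shifts only after its own intervention begins, while the untreated behavior stays at its baseline level in the meantime.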

The  primary  limitation  of  the  multiple-baseline  design  is  that  it  requires 
the  use  of  relatively  independent  behaviors.  The  behaviors  that  are  being 
monitored  must  not  be  so  interrelated  that  a  change  in  one  behavior  re¬ 
sults  in  similar  changes  in  others  even  though  the  other  behaviors  were  not 
exposed  to  the  intervention.  For  example,  Kazdin  (1973)  points  out  that 
the  design  would  not  be  useful  for  the  study  of  children’s  classroom  be¬ 
haviors  because  many  of  the  classroom  behaviors  are  interrelated. 

Overall, single-subject designs may be an important and logical
alternative to randomized experimental designs. Importantly, because of
their focus on single-subject behavior, these designs may be particularly
suited for clinicians who want to determine whether certain treatments are
working for specific clients or patients.

In this section, we have provided a brief overview of several of the most
widely used quasi-experimental designs. However, many other
quasi-experimental designs are available. In fact, there appears to be a
nearly endless number of ways to arrange the independent and dependent
variables in an attempt to answer experimental questions with some degree
of confidence. Unfortunately, despite their often elegant structure,
quasi-experimental designs cannot automatically rule out threats to
internal validity with the same degree of certainty that true experimental
designs can. At this point, however, the overall utility of
quasi-experimental designs should be evident. Although they do not enable
us to draw causal inferences with the same degree of confidence as do
randomized designs, they do allow us to begin to examine real-world
phenomena and begin to establish causal inferences when true experimental
designs are simply not feasible.


NONEXPERIMENTAL OR QUALITATIVE DESIGNS

In  the  past  two  sections,  we  discussed  experimental  and  quasi-experi¬ 
mental  designs.  Each  of  these  design  classes  can  provide  information 
from  which  to  draw  causal  inferences,  although  to  very  different  degrees 
of  certainty.  This  is  not  the  case  for  nonexperimental  designs  (i.e.,  de¬ 
scriptive  and  correlational  designs).  No  matter  how  convincing  the  data 
from  descriptive  and  correlational  studies  may  appear,  these  nonexperi¬ 
mental  designs  cannot  rule  out  extraneous  variables  as  the  cause  of  what 
is  being  observed  because  they  do  not  have  control  over  the  variables  and 
the  environments  that  they  study.  Although  there  are  many  types  of  non¬ 
experimental  methods,  an  extensive  review  of  these  techniques  and  de¬ 
signs  is  beyond  the  scope  of  this  chapter.  Therefore,  we  will  provide  a  brief 
overview  of  four  of  the  most  widely  used  approaches:  case  studies,  natu¬ 
ralistic  observation,  surveys,  and  focus  groups. 

Case  Studies 

Case studies involve an in-depth examination of a single person or a few
people. The goal of the case study is to provide an accurate and complete
description of the case. The principal benefit of case studies is that they
can expand our knowledge about the variations in human behavior.
Although experimental researchers are typically interested in overall trends
in behavior, drawing sample-to-population inferences, and generalizing to
other samples, the focus of the case-study approach is on individuality and
describing the individual as comprehensively as possible. The case study
requires a considerable amount of information, and therefore conclusions
are based on a much more detailed and comprehensive set of information
than is typically collected by experimental and quasi-experimental studies.

Case  studies  of  individual  participants  often  include  in-depth  inter¬ 
views  with  participants  and  collaterals  (e.g.,  friends,  family  members, 
colleagues),  review  of  medical  records,  observation,  and  excerpts  from 
participants’  personal  writings  and  diaries.  Case  studies  have  a  practical 
function  in  that  they  can  be  immediately  applicable  to  the  participant’s  di¬ 
agnosis  or  treatment. 

According  to  Yin  (1994),  the  case-study  design  must  have  the  following 
five  components:  its  research  question(s),  its  propositions,  its  unit(s)  of 
analysis,  a  determination  of  how  the  data  are  linked  to  the  propositions, 
and  criteria  to  interpret  the  findings.  According  to  Kazdin  (1982),  the  ma¬ 
jor  characteristics  of  case  studies  are  the  following: 

•  They  involve  the  intensive  study  of  an  individual,  family,  group, 
institution,  or  other  level  that  can  be  conceived  of  as  a  single  unit. 

•  The  information  is  highly  detailed,  comprehensive,  and  typically 
reported  in  narrative  form  as  opposed  to  the  quantified  scores  on 
a  dependent  measure. 

•  They  attempt  to  convey  the  nuances  of  the  case,  including  specific 
contexts,  extraneous  influences,  and  special  idiosyncratic  details. 

•  The  information  they  examine  may  be  retrospective  or  archival. 

Although case studies lack experimental control, their naturalistic and
uncontrolled methods have set them aside as a unique and valuable source
of information that complements and informs theory, research, and
practice (Kazdin, 2003c). According to Kazdin, case studies may be seen as
having made at least four substantial contributions to science: They have
served as a source of research ideas and hypotheses; they have helped to
develop therapeutic techniques; they have enabled scientists to study
extremely rare and low-base-rate phenomena, including rare disorders and
one-time events; and they can describe and detail instances that contradict
universally accepted beliefs and assumptions, thereby serving to plant
seeds of doubt and spur new experimental research to validate or
invalidate the accepted beliefs.

Case studies also have some substantial drawbacks. First, like all
nonexperimental approaches, they merely describe what occurred, but they
cannot tell us why it occurred. Second, they are likely to involve a great
deal of experimenter bias (refer back to Chapter 3). Although no research
design, including the randomized experimental designs, is immune to
experimenter bias, some, such as the case study, are at greater risk than
others.

The  reason  the  case  study  is  more  at  risk  with  respect  to  experimenter 
bias  is  that  it  involves  considerably  more  interaction  between  the  re¬ 
searcher  and  the  participant  than  most  other  research  methods.  In  addi¬ 
tion,  the  data  in  a  case  study  come  from  the  researcher’s  observations  of 
the  participant.  Although  this  might  also  be  supplemented  by  test  scores 
and  more  objective  measures,  it  is  the  researcher  who  brings  all  this  to¬ 
gether  in  the  form  of  a  descriptive  case  study  of  the  individual(s)  in  ques¬ 
tion. 

Finally,  the  small  number  of  individuals  examined  in  these  studies 
makes  it  unlikely  that  the  findings  will  generalize  to  other  people  with  sim¬ 
ilar  issues  or  problems.  A  case  study  of  a  single  person  diagnosed  with  a 
certain  disorder  is  unlikely  to  be  representative  of  all  individuals  with  that 
disorder.  Still,  the  overall  contributions  of  the  case  study  cannot  be  ig¬ 
nored.  Regardless  of  its  nonexperimental  approach — in  fact,  because  of  its 
nonexperimental  approach — it  has  substantially  informed  theory,  re¬ 
search,  and  practice,  serving  to  fulfill  the  first  goal  of  science,  which  is  to 
identify  issues  and  causes  that  can  then  be  experimentally  assessed. 

Naturalistic  Observation 

Naturalistic observation studies, as their name implies, involve observing
organisms in their natural settings. For example, a researcher who wants to
examine the socialization skills of children may observe them while they
are at a school playground, and then record all instances of effective or
ineffective social behavior. The primary advantage of the naturalistic
observation approach is that it takes place in a natural setting, where the
participants do not realize that they are being observed. Consequently, the
behaviors that it measures and describes are likely to reflect the
participants’ true behaviors.


Putting It Into Practice

A Refresher on Eliminating Experimenter Bias

As discussed in Chapter 3, there are several effective strategies for
reducing or eliminating the effects of experimenter bias. The first
strategy is to develop and employ highly specific study procedures. Using
clearly operationalized and standardized procedures can reduce the
opportunity for bias to influence the way that study participants are
treated and the way that data are considered or analyzed. A second strategy
is to reduce or eliminate experimenter-participant interactions. For
example, studies could be conducted via the Internet, or participants could
receive study instructions and assessments via computer (Kazdin, 2003c). A
third strategy is to keep the researcher unaware of participants’ specific
group assignments, typically referred to as making the researcher blind or
naive. Although this may be easiest in medication studies in which
participants receive either a placebo or a real medication, it can (with a
bit more effort) be employed in other studies. For example, a study could
use multiple researchers within sessions, so that those who deliver the
interventions are aware of the group assignments and those who administer
the dependent measure are not.


In general, naturalistic observation has four defining principles (Ray &
Ravizza, 1988). The first and most fundamental principle is that of
noninterference. Researchers who engage in naturalistic observation must
not disrupt the natural course of events that they are observing. By
adhering to this principle, researchers can observe events the way they
truly happen. Second, naturalistic observation involves the observation and
detection of invariants, or behavior patterns or other phenomena that exist
in the real world. For example, individuals may be found to behave in
similar ways at certain times or on certain days, in certain contexts, or
when in the company of certain people or groups. Third, the naturalistic
observation approach is particularly useful for exploratory purposes, when
we know little or nothing about a certain subject. In this vein,
naturalistic observation can provide a useful but global description of the
participant and a series of events as opposed to isolated ones. Finally,
the naturalistic observation method is basically descriptive. Although it
can provide a somewhat detailed description of a phenomenon, it cannot tell
us why the phenomenon occurred. Determining causation is left to
experimental designs, which were discussed in detail earlier in this
chapter.

The  main  limitation  of  the  naturalistic  approach  is  that  the  researcher 
has  no  real  control  over  the  setting.  In  the  hypothetical  study  of  children’s 
socialization  skills,  factors  other  than  a  child’s  gender  may  be  affecting  the 
child’s  social  behavior,  but  the  researcher  may  not  be  aware  of  those  other 
factors.  In  addition,  participants  may  not  have  an  opportunity  to  display 
the  behaviors  or  phenomena  the  researcher  is  trying  to  observe  because  of 
factors  that  are  beyond  the  researcher’s  control.  For  example,  some  of  the 
children  who  are  usually  the  most  aggressive  may  not  be  at  school  that  day 
or  may  instead  be  in  detention  because  of  previous  misconduct,  and  thus 
they  are  not  in  the  sample  of  children  on  the  playground.  A  final  limitation 
is  that  the  topics  of  study  are  limited  to  overt  behavior.  A  researcher  can¬ 
not  study  unobservable  processes  like  attitudes  or  thoughts  using  a  natu¬ 
ralistic  observation  study. 


Survey  Studies 

Survey  studies  ask  large  numbers  of  people  questions  about  their  behaviors, 
attitudes,  and  opinions.  Some  surveys  merely  describe  what  people  say 
they  think  and  do.  Other  survey  studies  attempt  to  find  relationships  be¬ 
tween  the  characteristics  of  the  respondents  and  their  reported  behaviors 
and  opinions.  For  example,  a  survey  could  examine  whether  there  is  a  re¬ 
lationship  between  gender  and  people’s  attitudes  about  some  social  issue. 
When  surveys  are  conducted  to  determine  relationships,  as  for  this  second 
purpose,  they  are  referred  to  as  correlational  studies. 
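
As a rough sketch of what such a relationship looks like numerically, the Python code below computes a Pearson correlation between a respondent characteristic coded 0 or 1 and an attitude rating. The coding scheme, the variable names, and the responses are hypothetical. A nonzero correlation describes an association only and says nothing about why it exists.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical survey responses: a respondent characteristic coded 0 or 1,
# and an attitude rating from 1 (strongly agree) to 5 (strongly disagree).
group = [0, 0, 0, 0, 1, 1, 1, 1]
attitude = [2, 1, 2, 3, 4, 3, 5, 4]

r = pearson_r(group, attitude)
print(f"r = {r:.2f}")
```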

Campbell and Katona (1953) delineated nine general steps for conducting
a survey. Although this list is more than 50 years old, it is as useful
now as it was then in providing a clear overview of survey procedures. The
nine steps are as follows:

1. General objectives: This step involves defining the general purpose
and goal of the survey.

2.  Specific  objectives:  This  step  involves  developing  more  specificity 
regarding  the  types  of  data  that  will  be  collected,  and  specifying 
the  hypothesis  to  be  tested. 

3.  Sample:  The  major  foci  of  this  step  are  to  determine  the  specific 
population  that  will  be  surveyed,  to  decide  on  an  appropriate 
sample,  and  to  determine  the  criteria  that  will  be  used  to  select 
the  sample. 

4.  Questionnaire:  The  focus  of  this  step  is  deciding  how  the  sample 
is  to  be  surveyed  (e.g.,  by  mail,  by  phone,  in  person)  and  devel¬ 
oping  the  specific  questions  that  will  be  used.  This  is  a  particu¬ 
larly  important  step  that  involves  determining  the  content  and 
structure  (e.g.,  open-ended,  closed-ended,  Likert  scales;  see 
Rapid  Reference  5.6)  of  the  questions,  as  well  as  the  general  for¬ 
mat  of  the  survey  instrument  (e.g.,  scripted  introduction,  order 
of  the  questions).  Importantly,  the  final  survey  should  be  sub¬ 
jected  to  a  protocol  analysis  in  which  it  is  administered  to  nu¬ 
merous  individuals  to  determine  whether  (a)  it  is  clear  and 
understandable  and  (b)  the  questions  get  at  the  type  of 
information  that  they  were  designed  to  collect.  For  certain 
scales,  such  as  Likert  scales,  you  may  also  want  to  look  for  cer¬ 
tain  response  patterns  to  see  whether  there  is  a  problematic  re¬ 
sponse  set  that  emerges,  as  indicated  by  restricted  variability  in 
responses  (e.g.,  all  items  rated  high,  all  items  rated  low,  or  all 
items  falling  in  between). 

5.  Fieldwork:  This  step  involves  making  decisions  about  the  indi¬ 
viduals  who  will  actually  administer  the  surveys,  and  about  their 
qualifications,  hiring,  and  training. 


Rapid Reference 5.6

Measurement Modalities

Three of the most common measurement modalities are open-ended
questions, closed-ended questions, and Likert scales. An open-ended
question does not provide the participant with a choice of answers.
Instead, participants are free to answer the question in any manner they
choose. An example of an open-ended question is the following: “How
would you describe your childhood?” By contrast, a closed-ended
question provides the participant with several answers from which to choose.
A common example of a closed-ended question is a multiple-choice
question, such as the following: “How would you describe your childhood?
(a) happy; (b) sad; (c) boring.” Finally, a Likert scale asks participants to
provide a response along a continuum of possible responses. Here’s an
example of a Likert scale: “My childhood was happy. (1) strongly agree; (2)
agree; (3) neutral; (4) disagree; (5) strongly disagree.”


6.  Content  analysis:  This  involves  transforming  the  often  qualitative, 
open-ended  survey  responses  into  quantitative  data.  This  may 
involve  developing  coding  procedures,  establishing  the  reliabil¬ 
ity  of  the  coding  procedures,  and  developing  careful  data  screen¬ 
ing  and  cleaning  procedures. 

7.  Analysis plan:  In  general,  these  procedures  are  fairly  straightfor¬ 
ward  because  the  analysis  of  survey  data  is  typically  confined  to 
descriptive  and  correlational  statistics.  Still,  even  survey  studies 
should  have  clear  statistical  analysis  plans. 

8.  Tabulation:  This  step  involves  decisions  about  data  entry. 

9.  Analysis  and  reporting:  As  with  all  studies,  the  final  steps  are  to 
conduct  the  data  analyses,  prepare  a  final  report  or  manuscript, 
and  disseminate  the  study’s  findings. 
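
The check for a problematic response set described in step 4 can be sketched in Python. The code below flags pilot respondents whose Likert ratings show restricted variability across items. The function name, the standard-deviation cutoff of 0.5, and the pilot data are hypothetical, and an appropriate cutoff would depend on the scale and the number of items.

```python
from statistics import pstdev

def flag_restricted_responders(responses, min_sd=0.5):
    """responses: dict mapping respondent id -> list of Likert ratings (1-5).

    Returns the ids whose ratings vary less than min_sd, a hypothetical
    cutoff for restricted variability."""
    return [rid for rid, ratings in responses.items() if pstdev(ratings) < min_sd]

# Hypothetical pilot data: six Likert items per respondent.
pilot = {
    "r1": [5, 5, 5, 5, 5, 5],   # every item given the same high rating
    "r2": [1, 1, 1, 1, 1, 1],   # every item given the same low rating
    "r3": [3, 3, 3, 3, 3, 3],   # every item rated at the midpoint
    "r4": [1, 4, 2, 5, 3, 4],   # varied responses
}

flagged = flag_restricted_responders(pilot)
print(flagged)
```

Flagged respondents are candidates for closer review rather than automatic exclusion, because uniform ratings can also reflect genuine opinions.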

Although a variety of methods for administering surveys are available,
the most popular are face-to-face, telephone, and mail. In general, each of
these methods has its own advantages and disadvantages. The major
consideration for the researcher in deciding on the form of survey
administration is response rate versus cost. As a rule of thumb (Ray &
Ravizza, 1988), if a high rate of return is the main goal, then face-to-face
or telephone surveys are the optimal choices, while mail surveys are the
obvious choice when cost is an issue.

The  principal  advantage  of  survey  studies  is  that  they  provide  informa¬ 
tion  on  large  groups  of  people,  with  very  little  effort,  and  in  a  cost- 
effective  manner.  Surveys  allow  researchers  to  assess  a  wider  variety  of  be¬ 
haviors  and  other  phenomena  than  can  be  studied  in  a  typical  naturalistic 
observation  study. 

Focus  Groups 

Focus  groups  are  formally  organized,  structured  groups  of  individuals 
brought  together  to  discuss  a  topic  or  series  of  topics  during  a  specific  pe¬ 
riod  of  time.  Like  surveys,  focus  groups  can  be  an  extremely  useful  tech¬ 
nique  for  obtaining  individuals’  impressions  and  concerns  about  certain 
issues,  services,  or  products. 

Originally  developed  for  use  in  marketing  research,  focus  groups  have 
served  as  a  principal  method  of  qualitative  research  among  social  scien¬ 
tists  for  many  decades.  In  contrast  to  other,  unilateral  methods  of  obtain¬ 
ing  qualitative  data  (e.g.,  observation,  surveys),  focus  groups  allow  for  in¬ 
teractions  between  the  researcher  and  the  participants  and  among  the 
participants  themselves. 

Like  most  other  qualitative  research  methods,  there  is  no  one  definitive 
way  to  design  or  conduct  a  focus  group.  However,  they  are  typically  com¬ 
posed  of  several  participants  (usually  6  to  10  individuals)  and  a  trained 
moderator.  Fewer  than  6  participants  may  restrict  the  diversity  of  the  opin¬ 
ions  to  be  offered,  and  more  than  10  may  make  it  difficult  for  everyone 
to  express  their  opinions  comprehensively  (Hoyle,  Harris,  &  Judd,  2002). 
Focus  groups  are  also  typically  made  up  of  individuals  who  share  a  partic¬ 
ular  characteristic,  demographic,  or  interest  that  is  relevant  to  the  topic  be¬ 
ing  studied.  For  example,  a  marketing  researcher  may  want  to  conduct  a 
focus  group  with  parents  of  young  children  to  determine  the  desirability 
of a new educational product. Similarly, a criminal justice researcher
interested in developing methods of reducing criminal recidivism may choose
to  conduct  focus  groups  with  recent  parolees  to  discuss  problems  that 
they  encountered  after  being  released  from  prison. 

The  presence  of  a  trained  moderator  is  critical  to  the  focus-group  pro¬ 
cess  (Hoyle  et  al.,  2002).  The  moderator  is  directly  responsible  for  setting 
the  ground  rules,  raising  the  discussion  topics,  and  maintaining  the  focus 
of  the  group  discussions.  When  setting  the  ground  rules,  the  moderator 
must,  above  all,  discuss  issues  of  confidentiality,  including  the  confiden¬ 
tiality  of  all  information  shared  with  and  recorded  by  the  researchers  (also 
covered  when  obtaining  informed  consent).  In  addition,  the  moderator 
will  often  request  that  all  participants  respect  each  other’s  privacy  by  keep¬ 
ing  what  they  hear  in  the  focus  groups  confidential.  Other  ground  rules 
may  involve  speaking  one  at  a  time  and  avoiding  criticizing  the  expressed 
viewpoints  of  the  other  participants. 

Considerable  preparation  is  necessary  to  make  a  focus  group  success¬ 
ful.  The  researcher  must  carefully  consider  the  make-up  of  the  group  (of¬ 
ten  a  nonrepresentative  sample  of  convenience),  prepare  a  list  of  objec¬ 
tives  and  topics  to  be  covered,  and  determine  clear  ground  rules  to  be 
communicated  to  the  group  participants.  When  considering  the  questions 
and  topics  to  be  covered,  the  researcher  should  again  take  into  account  the 
make-up  of  the  group  (e.g.,  intelligence  level,  level  of  impairment)  as  well 
as  the  design  of  the  questions.  For  example,  when  possible,  moderators 
should  avoid  using  closed-ended  questions,  which  may  not  generate  a 
great  deal  of  useful  dialogue.  Similarly,  moderators  should  avoid  using 
“why”  questions.  Questions  that  begin  with  “why”  may  elicit  socially  ap¬ 
propriate  rationalizations,  best  guesses,  or  other  attributions  about  an  in¬ 
dividual’s  behavior  when  the  person  is  unsure  or  unaware  of  the  true  rea¬ 
sons  or  underlying  motivations  for  his  or  her  behavior  (Nisbett  &  Wilson, 
1977).  Instead,  it  may  be  more  fruitful  to  ask  participants  about  what  they 
do  and  the  detailed  events  surrounding  their  behaviors.  This  may  ulti¬ 
mately  shed  more  light  on  the  actual  precipitants  of  participants’  behav¬ 
iors.  Overall,  focus  groups  should  attempt  to  cover  no  more  than  two  to 
three  major  topics  and  should  last  no  more  than  1  1/2  to  2  hours. 

The  obvious  advantage  of  a  focus  group  is  that  it  provides  an  open, 
fairly unrestricted forum for individuals to discuss ideas and to clarify each
other’s impressions and opinions. The group format can also serve to
crystallize  the  participants’  opinions.  However,  focus  groups  also  have 
several  disadvantages.  First,  because  of  their  relatively  small  sample  sizes 
and  the  fact  that  they  are  typically  not  randomly  selected,  the  information 
gleaned  from  focus  groups  may  not  be  representative  of  the  population 
in  general.  Second,  although  the  group  format  may  have  some  benefits  in 
terms  of  helping  to  flesh  out  and  distill  perceptions  and  concerns,  it  is  also 
very  likely  that  an  individual’s  opinions  can  be  altered  through  group  in¬ 
fluence.  Finally,  it  is  difficult  to  quantify  the  open-ended  responses  result¬ 
ing  from  focus  group  interactions. 

The  information  obtained  from  focus  groups  can  provide  useful  in¬ 
sight  into  how  various  procedures,  systems,  or  products  are  viewed,  as  well 
as  the  desires  and  concerns  of  a  given  population.  For  these  reasons,  focus 
groups,  similar  to  other  qualitative  research  methods,  often  form  the  start¬ 
ing  point  in  generating  hypotheses,  developing  questionnaires  and  sur¬ 
veys,  and  identifying  the  relevant  issues  that  may  be  examined  using  more 
quantifiable  research  methodologies. 

SUMMARY 

In  this  chapter,  we  have  provided  a  brief  introduction  to  the  three  main 
classes of research design: experimental, quasi-experimental, and
nonexperimental/qualitative. In addition to providing a general overview of
these  design  types,  we  hope  that  we  have  given  the  reader  a  stronger  ap¬ 
preciation  for  the  subtleties  of  experimental  design,  and  the  ways  that 
small  variations  can  affect  the  researcher’s  ability  to  rule  out  alternative  ex¬ 
planations  and  infer  causation.  We  also  hope  to  have  conveyed  an  appro¬ 
priate  respect  for  quasi-  and  nonexperimental  designs.  Although  these  de¬ 
signs  do  not  provide  researchers  with  the  same  amount  of  confidence  in 
their  conclusions,  they  are  often  necessary  given  the  specific  parameters  of 
the  topic  under  investigation  or  the  inability  to  study  a  specific  phenome¬ 
non  in  a  true  experimental  fashion.  Perhaps  most  important,  these  quasi- 
and nonexperimental designs often provide the foundation, preliminary
data,  and  conceptual  framework  from  which  scientifically  testable  hy¬ 
potheses  are  built. 


TEST YOURSELF

1. The most important element of a true experimental design is _____ assignment.

2. If groups are perfectly matched on all known factors, the researcher can
be certain that any group differences on outcomes are due to the independent
variable. True or False?

3. In randomized two-group designs, participants are typically assigned by
random selection to either an experimental or a _____ group.

4. Reversal or ABA designs cannot be used in all instances because some
phenomena and behaviors are simply not reversible. True or False?

5. A guided discussion to explore a group’s opinions and impressions on a
specific topic area is known as a _____.

Answers: 1. random; 2. False (It is still possible that any number of unknown
variables may be responsible for the group differences.); 3. control; 4. True;
5. focus group
VALIDITY


Validity is an important term in research that refers to the conceptual
and scientific soundness of a research study (Graziano & Raulin,
2004). As previously discussed, the primary purpose of all forms of
research  is  to  produce  valid  conclusions.  Furthermore,  researchers  are  in¬ 
terested  in  explanations  for  the  effects  and  interactions  of  variables  as  they 
occur  across  a  wide  variety  of  different  settings.  To  truly  understand  these 
interactions  requires  special  attention  to  the  concept  of  validity,  which 
highlights  the  need  to  eliminate  or  minimize  the  effects  of  extraneous  in¬ 
fluences,  variables,  and  explanations  that  might  detract  from  a  study’s  ul¬ 
timate  findings. 

Validity  is,  therefore,  a  very  important  and  useful  concept  in  all  forms  of 
research  methodology.  Its  primary  purpose  is  to  increase  the  accuracy  and 
usefulness  of  findings  by  eliminating  or  controlling  as  many  confounding 
variables  as  possible,  which  allows  for  greater  confidence  in  the  findings  of 
a  given  study.  There  are  four  distinct  types  of  validity  (internal  validity,  ex¬ 
ternal  validity,  construct  validity,  and  statistical  conclusion  validity)  that  in¬ 
teract  to  control  for  and  minimize  the  impact  of  a  wide  variety  of  extrane¬ 
ous  factors  that  can  confound  a  study  and  reduce  the  accuracy  of  its 
conclusions.  This  chapter  will  discuss  each  type  of  validity,  its  associated 
threats,  and  its  implications  for  research  design  and  methodology. 


INTERNAL  VALIDITY 

Internal validity refers to the ability of a research design to rule out or make
implausible alternative explanations of the results, or plausible rival
hypotheses (Campbell, 1957; Kazdin, 2003c). A plausible rival hypothesis is an
alternative interpretation of the researcher’s hypothesis about the interaction
of the independent and dependent variables that provides a reasonable
explanation of the findings other than the researcher’s original hypothesis
(Rosnow & Rosenthal, 2002).

DON'T FORGET

Internal Validity and Plausible Rival Hypotheses

Internal validity: The ability of a research design to rule out or make
implausible alternative explanations of the results, thus demonstrating that
the independent variable was directly responsible for the effect on the
dependent variable and, ultimately, for the results found in the study.

Plausible rival hypotheses: An alternative interpretation of the
researcher’s hypothesis about the interaction of the independent and
dependent variables that provides a reasonable explanation of the findings
other than the researcher’s original hypothesis.

Although  evidence  of  absolute  causation  is  rarely  achieved,  the  goal  of 
most  experimental  designs  is  to  demonstrate  that  the  independent  variable 
was  directly  responsible  for  the  effect  on  the  dependent  variable  and,  ulti¬ 
mately,  the  results  found  in  the  study.  In  other  words,  the  researcher  ulti¬ 
mately  wants  to  know  whether  the  observed  effect  or  phenomenon  is  due 
to  the  manipulated  independent  variable  or  variables  or  to  some  uncon¬ 
trolled  or  unknown  extraneous  variable  or  variables  (Pedhazur  & 
Schmelkin,  1991).  Ideally,  at  the  conclusion  of  the  study,  the  researcher 
would  like  to  make  a  statement  reflecting  some  level  of  causation  between 
the  independent  and  dependent  variables.  By  designing  strong  experimen¬ 
tal  controls  into  a  study,  internal  validity  is  increased  and  rival  hypotheses 
and  extraneous  influences  are  minimized.  This  allows  the  researcher  to  at¬ 
tribute  the  results  of  the  study  more  confidently  to  the  independent  variable 
or variables (Kazdin, 2003c; Rosnow & Rosenthal, 2002). Uncontrolled
extraneous influences other than the independent variable that could explain
the  results  of  a  study  are  referred  to  as  threats  to  internal  validity. 
Putting It Into Practice


An  Example  of  Internal  Validity  and  Plausible 
Rival  Hypotheses 

A researcher is interested in the effectiveness of two different parental
skills training and education programs on improving symptoms of depression
in adolescents. The researcher recruits 100 families that meet specified
inclusion criteria in the study. The primary inclusion criterion is that
the family must have an adolescent who currently meets criteria for
depression. After recruitment, the researcher then randomly assigns the
families into one of the two skills training programs. The parents receive
the interventions over a 10-week period and are then sent home to apply
the skills they have learned. The researcher reevaluates the adolescents 6
months later to see whether there has been improvement in the adolescents’
symptoms of depression. The results suggest that both groups improved.
The researcher concludes that both parental skills training interventions
were effective for treating depression in adolescents. Given the
limited information here, is this an appropriate conclusion?

The answer, of course, is no. This study has poor internal validity because
it is impossible to say with any certainty that the independent variable
(the two skills training classes) had an effect on the dependent variable
(depression). There are a number of alternative rival hypotheses that have
not been controlled for and could just as easily explain the results of the
study. Many things could have transpired over the course of the 6 months.
For example, were certain adolescents placed on medication? Would
they have improved without the intervention? Did their life circumstances
change for the better? We will never know because the study has poor
internal validity and does not control for even the simplest and most
obvious alternative explanations.


Threats  to  Internal  Validity 

Although the terminology may vary, the most commonly encountered
threats to internal validity are history, maturation, instrumentation,
testing, statistical regression, selection biases, attrition, diffusion or imitation
of treatment, and special treatment or reactions of controls (Christensen,
1988; Cook & Campbell, 1979; Kazdin, 2003c; Pedhazur & Schmelkin,
1991). Researchers must be aware that every methodological design is
subject to at least some of these potential threats and must control for them
accordingly. Failure to implement appropriate controls affects the
researcher’s ability to infer causality.

DON'T FORGET

Threats to Internal Validity

As discussed in Chapters 3 and 5, most threats to internal validity are
controlled through statistical analyses, control and comparison groups,
and randomization. The underlying assumption of randomization as it
applies to internal validity is that extraneous factors are evenly distributed
across all groups within the study. Control groups allow for direct
comparison between experimental groups and the evaluation of suspected
extraneous influences. Statistical controls are typically used when participants
cannot be randomly assigned to experimental conditions, and involve
statistically controlling for variables that the researcher has identified as
differing between groups.
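
The randomization idea described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the participant records, the age variable, the group sizes, and the function name randomly_assign are all hypothetical and do not come from any study discussed here. The sketch shuffles a sample, deals it into two groups, and prints the group means on an extraneous factor so the reader can see how chance assignment tends to balance such factors.

```python
import random
from statistics import mean


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle participants and deal them into n_groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    # Every n_groups-th participant goes to the same group.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical sample: 100 participants with one extraneous factor (age).
_rng = random.Random(0)
participants = [{"id": i, "age": _rng.randint(13, 18)} for i in range(100)]

treatment, control = randomly_assign(participants, n_groups=2, seed=42)

# Group sizes, followed by the mean age in each group.
print(len(treatment), len(control))
print(round(mean(p["age"] for p in treatment), 2),
      round(mean(p["age"] for p in control), 2))
```

Random assignment balances extraneous factors only on average, so with small samples a researcher may still wish to inspect group means on known factors, as the final print statement does.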

History 

Generally,  history  as  a  threat  to  internal  validity  refers  to  events  or  incidents 
that  take  place  during  the  course  of  the  study  that  might  have  an  unin¬ 
tended  and  uncontrolled-for  impact  on  the  study’s  final  outcome  (or  the 
dependent  variable;  Kazdin,  2003c).  These  events  tend  to  be  global 
enough  that  they  affect  all  or  most  of  the  participants  in  a  study.  They  can 
occur  inside  or  outside  the  study  and  typically  occur  between  the  pre-  and 
postmeasurement  phases  of  the  dependent  variable.  The  impact  of  history 
as  a  threat  to  internal  validity  is  usually  seen  during  the  postmeasurement 
phase  of  the  study  and  is  particularly  prevalent  if  the  study  is  longitudinal 
and  therefore  takes  place  over  a  long  period  of  time.  Accordingly,  the 
longer  the  period  of  time  between  the  pre-  and  postmeasure,  the  greater 
the  possibility  that  a  history  effect  could  have  confounded  the  results  of 
the  study  (Christensen,  1988). 

For  example,  an  anxiety-provoking  catastrophic  national  event  could 
have  an  impact  on  many  if  not  all  participants  in  a  study  for  the  treatment 
of anxiety. The event could produce an escalation in symptoms that might
be  interpreted  as  a  failure  of  the  intervention,  when,  in  actuality,  it  is  an 
artifact  of  the  external  event  itself.  Depending  on  the  timing,  this  external 
event  could  have  a  significant  impact  on  the  measurement  of  the  depen¬ 
dent  variable. 

Another  example  can  be  found  in  our  previous  discussion  of  the  effec¬ 
tiveness  of  parent  skills  training  on  adolescent  symptoms  of  depression 
(see  Putting  It  Into  Practice  on  page  160).  In  that  example,  symptoms  of 
depression  were  evaluated  6  months  after  the  parental  skills  training  inter¬ 
vention.  It  is  possible  that  some  other  significant  event  occurred  during 
that  time  period  that  might  account  for  the  reduced  symptoms  of  depres¬ 
sion.  One  possibility  is  that  school  ended  for  the  year  and  summer  vaca¬ 
tion  started,  which  produced  a  decrease  in  depressive  symptoms  among 
the  sample  of  adolescents.  So,  the  decrease  in  depression  might  be  due  to 
a  historical  artifact  and  not  to  the  independent  variable  (i.e.,  the  parent 
skills training intervention). Historical events can also take place within
the  confines  of  the  study,  although  this  is  less  common.  For  example,  an 
argument  between  two  researchers  that  takes  place  in  plain  view  of  partic¬ 
ipants  and  is  not  part  of  the  intended  intervention  is  an  event  that  can  pro¬ 
duce  a  history  effect. 

Maturation 

This  threat  to  internal  validity  is  similar  to  history  in  that  it  relates  to 
changes  over  time.  Unlike  history,  however,  maturation  refers  to  intrinsic 
changes  within  the  participants  that  are  usually  related  to  the  passage  of  time. 
The  most  commonly  cited  examples  of  this  involve  both  biological  and 
psychological  changes,  such  as  aging,  learning,  fatigue,  and  hunger  (Chris¬ 
tensen,  1988).  As  with  history,  the  presence  of  maturational  changes  oc¬ 
curs  between  the  pre-  and  postmeasurement  phases  of  the  study  and  in¬ 
terferes  with  interpretations  of  causation  regarding  the  independent  and 
dependent  variables.  Historical  and  maturational  threats  tend  to  be  found 
in  combination  in  longitudinal  studies. 

In  our  parent  skills  training  example,  might  the  symptoms  of  depres¬ 
sion  have  improved  because  the  parents  had  an  additional  6  months  to 
develop as parents, regardless of the skills training? Although it’s unlikely,
this  is  an  alternative  rival  hypothesis  that  must  be  considered  and  con¬ 
trolled  for,  most  likely  through  the  inclusion  of  a  control  or  comparison 
group  that  did  not  receive  the  parent  skills  training. 

Another  example  would  be  a  study  examining  the  effects  of  visualiza¬ 
tion  on  strength  training  in  male  adolescents  over  a  specified  period  of 
time.  As  adolescent  males  mature  naturally,  we  would  expect  to  see  incre¬ 
mental  increases  in  strength  regardless  of  the  visualization  intervention. 
So,  a  causal  statement  regarding  the  effects  of  visualization  on  strength  in 
adolescent males would have to be qualified in the context of the maturational
threat to internal validity. Again, this threat could be minimized
through  the  use  of  control  or  comparison  groups. 

Instrumentation 

This threat to internal validity is unrelated to participant characteristics and
refers to changes in the assessment of the dependent variable, which are
usually related to changes in the measuring instrument or measurement
procedures over time (Christensen, 1988; Kazdin, 2003c). In essence,
instrumentation compromises internal validity when changes in the dependent
variable result from changes over time in the assessment instruments and
scoring criteria used in the study. There is a wide variety of measurement
and assessment techniques available to researchers, and some of these are
more susceptible to instrumentation effects than others. The susceptibility
of a measure to instrumentation bias is usually a function of standardization.


DON'T  FORGET 


Important Considerations Regarding Instrumentation

•  Standardization  refers  to  the 
guidelines  established  in  the  ad¬ 
ministration  and  scoring  of  an 
instrument  or  other  assessment 
method. 

•  Reliability  is  present  when  an  as¬ 
sessment  method  measures  the 
characteristics  of  interest  in  a 
consistent  fashion. 

•  Validity  is  present  when  the  ap¬ 
proach  to  measurement  used  in 
the  study  actually  measures 
what  it  is  supposed  to  measure. 
Standardization refers to the guidelines established in the administration
and  scoring  of  an  instrument  or  other  assessment  method,  and  also  en¬ 
compasses  the  psychometric  concepts  of  reliability  and  validity.  An  ap¬ 
proach  to  measurement  is  reliable  if  it  assesses  the  characteristics  of  inter¬ 
est  in  a  consistent  fashion.  Validity  refers  to  whether  the  approach  to 
measurement  used  in  the  study  actually  measures  what  it  is  supposed  to 
measure.  Instruments  that  are  standardized  and  psychometrically  sound 
are  least  susceptible  to  instrumentation  effects,  while  other  types  of  as¬ 
sessment  methods  (e.g.,  independent  raters,  clinical  impressions,  “home¬ 
made”  instruments)  dramatically  increase  the  possibility  of  instrumenta¬ 
tion  effects. 

For  example,  a  researcher  could  use  a  number  of  measurement  ap¬ 
proaches  in  a  treatment  study  of  depression.  The  researcher  could  use,  for 
example,  a  standardized  measure  to  assess  symptoms  of  depression,  such 
as  the  Beck  Depression  Inventory  (BDI),  which  is  a  self-report,  paper- 
and-pencil test known for its reliability and validity (Beck et al., 1961). The
BDI  is  also  standardized  in  that  respondents  are  all  exposed  to  the  same 
stimuli,  which  is  a  set  of  questions  related  to  symptoms  of  depression. 
This  high  level  of  standardization  in  administration  and  scoring  makes  it 
unlikely  that  instrumentation  effects  would  be  present.  In  other  words, 
unless  the  researchers  altered  the  items  of  the  BDI,  modified  the  adminis¬ 
tration  procedures,  or  switched  to  a  different  version  of  the  instrument 
midway  through  the  study,  we  would  not  expect  instrumentation  to  be  a 
significant  threat  to  the  internal  validity  of  the  study. 

Conversely,  other  approaches  to  measurement  are  more  susceptible  to 
possible  instrumentation  effects.  There  are  many  different  ways  to  mea¬ 
sure  the  construct  of  depression.  Let’s  assume  that  the  BDI  was  unavail¬ 
able,  so  the  researcher  had  to  rely  on  some  other  method  for  assessing  the 
impact  of  treatment  on  symptoms  of  depression.  A  common  solution  to 
this  problem  might  be  to  have  independent  raters  assess  the  level  of  symp¬ 
toms  based  on  clinical  diagnostic  criteria  and  then  assess  the  participants 
over  the  course  of  the  intervention.  This  type  of  approach  to  measure¬ 
ment,  if  poorly  implemented,  dramatically  increases  the  likelihood  of  in¬ 
strumentation  effects. 
The primary concern is that the raters might have different standards for
what qualifies as meeting the criteria for symptoms of depression. Let’s
assume that rater A requires significantly more impairment in functioning
from a participant than rater B does before acknowledging that depression
or depressive symptoms are actually present. Furthermore, the rater standards
for identifying the symptoms and making the diagnosis of depression
might fluctuate significantly over time, which adds yet another layer of
difficulty when the researcher attempts to interpret the impact of treatment
(the independent variable) on depression (the dependent variable). Without
standardization, there is a significant likelihood that any changes in the
dependent variable over the course of treatment might be the result of
changes in scoring criteria and not the intervention itself. These issues are
usually addressed through ongoing training and frequent interrater reliability
checks (a statistical method for determining the level of consistency and
agreement between different raters).
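
One way an interrater reliability check of this kind might be carried out is sketched below in Python. The text does not name a specific statistic, so the use of Cohen's kappa (agreement between two raters, corrected for chance agreement) is an assumption made for illustration, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of cases.")
    n = len(ratings_a)
    # Proportion of cases on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten participants ("dep" = meets criteria for depression).
rater_a = ["dep", "dep", "not", "not", "dep", "not", "not", "not", "dep", "not"]
rater_b = ["dep", "not", "not", "not", "dep", "not", "dep", "not", "dep", "not"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this made-up example the raters agree on 8 of 10 cases, yet kappa is noticeably lower than 0.8 because part of that agreement would be expected by chance. Repeating such a check at several points during a study is one way to detect drift in rater standards over time.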

CAUTION

Instrumentation Effects

Instrumentation effects are least prevalent when using standardized,
psychometrically sound instruments to measure the variables of interest.
When such measures are not available, the likelihood of instrumentation
effects rises dramatically. In such cases, ongoing training of raters
and interrater reliability checks are an absolute necessity.

Testing

This threat to internal validity refers to the effects that taking a test on one
occasion may have on subsequent administrations of the same test
(Kazdin, 2003c). In essence, when participants in a study are measured
several times on the same variable (e.g., with the same instrument or test),
their performance might be affected by factors such as practice, memory,
sensitization, and participant and researcher expectancies (Pedhazur &
Schmelkin, 1991). This threat to internal validity is most often encountered
in longitudinal research where participants are repeatedly measured
on the same variables over time. The ultimate concern with this threat to
internal validity is that the results of the study might be related to the
repeated testing or evaluation and not the independent variable itself.

For  example,  let’s  consider  a  hypothetical  study  designed  to  assess  the 
impact  of  guided  imagery  techniques  on  the  retention  of  a  series  of  ran¬ 
dom  symbols.  First,  each  participant  is  exposed  to  the  random  symbols 
and then asked to reproduce as many as possible from memory after a
15-minute delay. This serves as a pretest or baseline measure of memory
performance. Next, participants are exposed to the intervention, which is a
series  of  guided  imagery  techniques  that  the  researchers  believe  will  im¬ 
prove  retention  of  the  symbols.  The  researchers  believe  that  recall  of  the 
symbols  will  increase  as  participants  learn  each  of  six  imagery  techniques, 
with  the  highest  level  of  recall  coming  after  participants  have  learned  all  of 
the  imagery  techniques.  In  this  case,  the  guided  imagery  technique  is  the 
intervention  or  independent  variable,  and  the  recall  of  the  random  sym¬ 
bols  is  the  dependent  variable.  The  participants  are  exposed  to  six  learn¬ 
ing  trials.  During  each  trial,  the  participant  is  taught  a  new  imagery  tech¬ 
nique,  exposed  to  the  same  random  symbol  stimuli,  and  then  asked  to 
reproduce  as  many  as  possible  after  a  15-minute  delay.  Ideally,  the  partici¬ 
pants  are  using  their  imagery  techniques  to  aid  in  retention  of  the  symbols. 
Keep  in  mind  here  that  the  participants  are  being  tested  on  the  same  set  of 
symbols  on  six  different  occasions,  and  that  the  symbol  set  in  this  example 
is  the  testing  instrument  and  outcome  measure.  The  researchers  run  their 
trials  and  confirm  their  hypotheses.  The  participants  perform  above  base¬ 
line  expectations  after  the  first  trial  and  their  performance  improves  con¬ 
sistently  as  they  are  exposed  to  additional  imagery  techniques.  The  best 
performance  is  seen  after  the  final  imagery  technique  is  implemented. 

Can  it  be  said  that  the  imagery  techniques  are  the  cause  of  the  improved 
retention  of  the  random  symbols?  The  researchers  could  make  that  asser¬ 
tion,  but  the  presence  of  a  testing  effect  seriously  undermines  the  credi¬ 
bility  of  their  results.  Remember  that  the  participants  are  exposed  to  the 
same  test  or  outcome — the  random  symbols — on  at  least  seven  different 
occasions. This introduces a strong plausible rival hypothesis that the
improvement in retention is simply due to a practice effect, or the repeated
exposure to the same stimuli. As the researchers did not account for this
possibility with a control group or by varying the content of the symbol
stimulus, this remains a legitimate explanation for the findings. In other words,
the  practice  effect  provides  a  plausible  alternative  hypothesis. 
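
To show how a control group could help separate a practice effect from an intervention effect, the following Python sketch compares pretest-to-posttest gains in two groups. All numbers are hypothetical and invented for illustration. The sketch assumes a control group that is tested on the same symbol set on the same schedule but receives no imagery training, so that its gain serves as a rough estimate of the practice effect.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average within-person change from pretest to posttest."""
    return mean(after - before for before, after in zip(pre, post))


# Hypothetical recall scores (number of symbols reproduced).
imagery_pre = [10, 12, 9, 11, 13]
imagery_post = [18, 19, 16, 20, 21]
control_pre = [11, 10, 12, 9, 13]    # same tests, no imagery training
control_post = [17, 15, 18, 14, 19]

imagery_gain = mean_gain(imagery_pre, imagery_post)
control_gain = mean_gain(control_pre, control_post)

# The control group's gain estimates what repeated testing alone produces;
# only the difference can be tentatively attributed to the intervention.
beyond_practice = imagery_gain - control_gain

print(imagery_gain, control_gain, round(beyond_practice, 2))
```

This comparison is descriptive only. A real analysis would also need an inferential test and adequately sized, randomly assigned groups before any causal statement could be made.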

Statistical  Regression 

This  threat  to  internal  validity  refers  to  a  statistical  phenomenon  whereby 
extremely  high  or  low  scores  on  a  measure  tend  to  revert  toward  the  arith¬ 
metic  mean  or  average  of  the  distribution  with  repeated  testing  (Chris¬ 
tensen,  1988;  Kazdin,  2003c;  Neale  &  Liebert,  1973). 

For  example,  let’s  assume  that  we  obtained  the  following  array  of  scores 
on  our  symbol  retention  measure  from  the  preceding  example:  5,  12,  18, 
19, 27, 42, 55, and 62. The mean for this set of scores is 30 (240 ÷ 8 = 30).
On  average,  the  participants  in  the  study  recalled  30  random  symbols 
when  assessed  for  retention.  Generally,  statistical  regression  suggests  that 
over  time  and  repeated  administration  of  the  memory  assessment,  we 
would  expect  the  scores  in  this  array  to  revert  closer  to  the  mean  score  of 
30.  This  is  particularly  true  of  extreme  scores  that  lie  far  outside  the  nor¬ 
mal  range  of  a  distribution.  These  extreme  scores  are  also  known  as  outliers. 
In  a  distribution  of  scores  with  a  mean  of  30,  it  would  be  reasonable  to 
identify,  at  a  minimum,  the  scores  of  5  and  62  as  outliers.  So,  on  our  next 
administration  of  the  memory  test,  we  would  expect  all  of  these  scores  to 
revert  closer  to  the  mean,  regardless  of  the  effect  of  the  intervention  (or  indepen¬ 
dent  variable).  In  addition,  we  would  probably  see  the  largest  movement 
toward  the  mean  in  the  more  extreme  scores. 
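
The arithmetic and the regression phenomenon described above can be checked with a brief Python sketch. The first part uses the eight scores from the example. The second part is a simulation under assumptions that are not part of the original example: each observed score is modeled as a stable true score plus random measurement error, no intervention occurs between the two testings, and the sample size, standard deviations, and seed are arbitrary choices.

```python
import random
from statistics import mean

# Scores from the example: the sum is 240 and 240 / 8 = 30.
scores = [5, 12, 18, 19, 27, 42, 55, 62]
print(sum(scores), mean(scores))


def simulate_retest(n=2000, true_mean=30.0, true_sd=10.0, noise_sd=10.0, seed=1):
    """Simulate two testings: observed score = true score + random error."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    first = [t + rng.gauss(0, noise_sd) for t in true_scores]
    second = [t + rng.gauss(0, noise_sd) for t in true_scores]
    return first, second


test1, test2 = simulate_retest()

# Select the lowest and highest 10% of participants on the FIRST testing only.
order = sorted(range(len(test1)), key=lambda i: test1[i])
k = len(order) // 10
lowest, highest = order[:k], order[-k:]

low_first = mean(test1[i] for i in lowest)
low_second = mean(test2[i] for i in lowest)
high_first = mean(test1[i] for i in highest)
high_second = mean(test2[i] for i in highest)

# With no intervention at all, both extreme groups drift toward the mean.
print(round(low_first, 1), round(low_second, 1))
print(round(high_first, 1), round(high_second, 1))
```

Because nothing happens between the two simulated testings, any movement of the extreme groups toward the overall mean reflects measurement error alone, which is the pattern that could be mistaken for a treatment effect.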

DON'T FORGET

Outliers

An outlier is a score lying far outside the normal range of a distribution
of scores.

This phenomenon is particularly prevalent in research in which a pre- and
posttest design is used to assess the variable of interest or when participants
are assigned to experimental groups based on extreme scores. Let’s consider
a different example to illustrate this point. A study is designed to assess the
impact of a new, 10-week treatment for anxiety. The researchers are
interested in the effects of
their  new  treatment  on  low,  medium,  and  high  anxiety  levels  as  deter¬ 
mined  by  a  score  on  a  standardized  measure  of  anxiety.  The  researchers 
hope  that  their  new  treatment  will  reduce  symptoms  of  anxiety  across 
each  of  the  three  conditions.  Accordingly,  each  participant  is  administered 
the  anxiety  measure  as  a  pretest  to  determine  his  or  her  current  anxiety 
level  and  then  is  assigned  to  one  of  three  groups — low,  medium,  or  high 
anxiety — on  the  basis  of  predetermined  cutoff  scores.  For  the  sake  of  clar¬ 
ity,  let’s  assume  the  mean  anxiety  level  for  the  entire  sample  was  30,  the 
mean  for  the  low-anxiety  group  was  12,  the  mean  for  the  medium-anxiety
group  was  29,  and  the  mean  for  the  high-anxiety  group  was  42. 

Each  of  these  groups  then  receives  ongoing  treatment  and  assessment 
over  the  10-week  protocol.  The  results  of  the  study  suggest  that  anxiety 
scores  increased  in  the  low-anxiety  condition,  stayed  roughly  the  same  in
the  medium-anxiety  condition,  and  decreased  in  the  high-anxiety  condi¬ 
tion.  Our  somewhat  befuddled  researchers  conclude  that  their  treatment 
is  effective  only  for  cases  of  severe  anxiety,  exacerbates  symptoms  in  indi¬ 
viduals  with  minimal  symptoms  of  anxiety,  and  has  little  to  no  effect  on 
moderate  levels  of  anxiety.  Although  these  findings  might  be  accurate,  it  is 
also  possible  that  they  are  the  result  of  statistical  regression.  The  scores  in 
the  high-anxiety  group  might  have  reverted  to  the  overall  group  mean  over
the  10  weeks,  giving  the  impression  that  symptom  reduction  resulted  from
the  intervention.  Similarly,  the  perceived  increase  in  symptoms  in  the  low- 
anxiety  group  might  be  the  result  of  those  low  scores’  moving  toward  the 
overall  group  mean.  In  other  words,  the  mean  scores  for  both  of  these 
groups  included  extreme  scores,  or  outliers,  which  were  then  influenced 
by  regression  to  the  mean.  It  is  therefore  possible  that  we  would  have  seen 
the  same  results  even  without  the  impact  of  the  independent  variable. 
Note  that  the  medium-anxiety  group  did  not  change  and  that  this  was  the 
group  whose  mean  score  was  closest  to  the  overall  sample  mean,  which 
makes  it  least  susceptible  to  the  effects  of  statistical  regression.  This  could 
account  for  the  possibly  erroneous  conclusion  that  the  treatment  proto¬ 
col  was  ineffective  on  moderate  symptoms  of  anxiety. 
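
To see how this pattern can arise with no treatment effect at all, consider the following minimal simulation sketch. Everything in it is an assumption made for illustration: the cutoff scores of 20 and 40, the spread of the scores, and the amount of measurement error are hypothetical values, and each participant's underlying anxiety level is held constant between pretest and posttest.

```python
import random
from statistics import mean


def simulate_untreated_groups(n=20000, seed=1):
    """Simulate pretest and posttest anxiety scores with NO treatment effect."""
    rng = random.Random(seed)
    groups = {"low": [], "medium": [], "high": []}
    for _ in range(n):
        true_level = rng.gauss(30, 10)           # stable underlying anxiety
        pretest = true_level + rng.gauss(0, 6)   # measurement error at pretest
        posttest = true_level + rng.gauss(0, 6)  # independent error at posttest
        if pretest < 20:      # hypothetical cutoff for the low-anxiety group
            groups["low"].append((pretest, posttest))
        elif pretest > 40:    # hypothetical cutoff for the high-anxiety group
            groups["high"].append((pretest, posttest))
        else:
            groups["medium"].append((pretest, posttest))
    # Mean pretest-to-posttest change within each group.
    return {name: mean(post - pre for pre, post in pairs)
            for name, pairs in groups.items()}


changes = simulate_untreated_groups()
for name, change in changes.items():
    print(f"{name:>6} anxiety group: mean change {change:+.2f}")
```

Because participants are sorted into groups on the basis of a pretest score that contains measurement error, the low group's mean tends to rise and the high group's mean tends to fall on retesting, while the medium group's mean stays close to where it started. This mirrors the pattern the hypothetical researchers attributed to their treatment.
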

Selection  Biases

This  threat  to  internal  validity  refers  to  systematic  differences  in  the  as¬ 
signment  of  participants  to  experimental  conditions.  As  noted  in  Chapter 
5,  selection  biases  are  prevalent  in  quasi-experimental  research  in  which 
participants  are  assigned  to  experimental  conditions  or  comparison 
groups  in  a  nonrandom  fashion  (Christensen,  1988;  Kazdin,  2003c;
Rosnow  &  Rosenthal,  2002).  Remember,  randomization  is  designed  to  control
for  systematic  participant  differences  across  experimental  and  control 
groups.  In  essence,  randomization  evenly  distributes  and  equates  groups 
on  any  potential  confounding  variables.  Without  randomization,  it  is  more 
difficult  to  account  and  control  for  these  systematic  variations  in  partici¬ 
pant  characteristics.  As  with  all  threats  to  internal  validity,  selection  bias 
can  have  a  negative  impact  on  the  researcher’s  ability  to  draw  causal  infer¬ 
ences  about  the  effects  of  the  independent  variable. 

As  mentioned  previously,  selection  biases  are  common  in  quasi- 
experimental  research  in  which  randomization  cannot  be  accomplished. 
The  most  common  example  of  this  is  when  the  experimenter  attempts  to 
conduct  research  in  a  setting  or  under  a  set  of  circumstances  where  the 
groups  are  already  formed  and  cannot  be  altered.  In  other  words,  for 
whatever  reason,  randomization  is  not  feasible  or  possible. 

For  example,  let’s  consider  a  design  to  test  the  effectiveness  of  a  classroom 
intervention  to  improve  mathematics  skills  in  two  classes  of  third  graders. 
Because  the  students  are  already 
assigned  to  classes,  randomization 
is  not  possible,  and  the  study  is 
therefore  quasi-experimental  in 
nature.  Both  classes  receive  a 
grade-appropriate  pretest.  Class  1 
receives  the  mathematics  interven¬ 
tion  and  Class  2  does  not.  In  this 
case,  Class  2  is  acting  as  a  control 
group  because  it  does  not  receive 
the  intervention.  Both  classes  then 
receive  a  posttest. 


CAUTION


Selection  Biases 

Selection  biases  are  common  in 
quasi-experimental  designs  and 
can  interact  with  other  threats  to 
internal  validity,  such  as  matura¬ 
tion,  history,  or  instrumentation, 
to  produce  effects  that  might  not 
be  attributable  to  the  independent 
variable. 

If  Class  1  performs  better,  is  it  safe  to  conclude  that  the  intervention,
or  independent  variable,  is  responsible  for  the  improvement?  Although  it 
is  possible,  there  are  a  number  of  plausible  rival  hypotheses  that  have  not 
been  controlled  for.  Most  of  these  hypotheses  revolve  around  preexisting 
differences  between  the  two  groups  (i.e.,  before  the  intervention  was  de¬ 
livered).  For  example,  it  is  possible  that  the  students  in  Class  1  are  more 
motivated  or  mature  than  their  counterparts  in  Class  2.  In  fact,  any  preex¬ 
isting  difference  between  the  compositions  of  the  two  groups  is  a  threat  to 
internal  validity.  Any  of  these  differences  might  provide  a  valid  explana¬ 
tion  for  the  results  of  the  math  intervention. 
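
The following sketch illustrates how a preexisting difference alone could produce this result. It is a hypothetical simulation, not a description of any real study: the class size of 25, the half-point difference in average motivation, and the way motivation drives improvement are all assumed values, and the intervention itself is given no effect whatsoever.

```python
import random
from statistics import mean


def class_gain(rng, n_students, mean_motivation):
    """Mean pretest-to-posttest gain for one intact class (no intervention effect)."""
    gains = []
    for _ in range(n_students):
        motivation = rng.gauss(mean_motivation, 1.0)
        # Gain depends on motivation and noise only; the intervention adds nothing.
        gains.append(5.0 + 3.0 * motivation + rng.gauss(0, 2.0))
    return mean(gains)


def share_of_studies_favoring_class1(n_studies=1000, seed=7):
    """Repeat the two-class study many times and count how often Class 1 gains more."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_studies):
        class1 = class_gain(rng, 25, mean_motivation=0.5)  # more motivated class
        class2 = class_gain(rng, 25, mean_motivation=0.0)
        wins += class1 > class2
    return wins / n_studies


share = share_of_studies_favoring_class1()
print(f"Class 1 outgains Class 2 in {share:.0%} of simulated studies")
```

In this sketch, Class 1 shows the larger improvement in the large majority of simulated studies even though the intervention does nothing, which is why a preexisting difference such as motivation remains a plausible rival hypothesis.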

Attrition 

This  threat  to  internal  validity  refers  to  the  differential  and  systematic  loss 
of  participants  from  experimental  and  control  groups.  In  essence,  partic¬ 
ipants  drop  out  of  the  study  in  a  systematic  and  nonrandom  way  that  can 
affect  the  original  composition  of  groups  formed  for  the  purposes  of  the 
study  (Beutler  &  Martin,  1999).  The  potential  net  result  of  attrition  is  that 
the  effects  of  the  independent  variable  might  be  due  to  the  loss  of  partic¬ 
ipants  and  not  to  the  manipulation  of  the  independent  variable. 

Commentators  have  noted  that  this  threat  to  internal  validity  is  com¬ 
mon  in  longitudinal  research  and  is  a  direct  function  of  time  (Kazdin, 
2003c;  Phillips,  1985).  In  general,  attrition  rates  average  between  40  and 
60%  in  longitudinal  intervention  research,  with  most  participants  drop¬ 
ping  out  during  the  earliest  stages  of  the  study  (Kazdin).  Attrition  applies 
to  most  forms  of  group  and  single-case  designs  and  can  be  a  threat  to  in¬ 
ternal  validity  even  after  the  researcher  has  randomly  assigned  participants 
to  experimental  and  control  groups.  This  is  because  attrition  occurs  as  the 
study  progresses  and  after  participants  have  been  assigned  to  each  of  the 
conditions.  Attrition  raises  the  possibility  that  the  groups  differ  on  certain 
characteristics  that  were  originally  controlled  for  through  randomization. 
In  other  words,  the  remaining  participants  no  longer  represent  the  origi¬ 
nal  sample  and  the  groups  might  no  longer  be  equivalent. 

Let’s  consider  an  example.  A  researcher  decides  to  conduct  a  study  of 
the  effectiveness  of  a  new  drug  on  symptoms  of  anxiety.  Randomization 
is  used  to  assign  participants  to  either  a  medication  (i.e.,  experimental)
group  or  placebo  (i.e.,  control)  group.  Let’s  assume  that  over  the  course  of 
the  study,  participants  in  the  experimental  group  experience  some  rela¬ 
tively  severe  side  effects  from  the  medication  and  an  increase  in  anxiety, 
causing  some  to  drop  out  of  the  study.  The  placebo  group  does  not  expe¬ 
rience  the  side  effects,  so  the  dropout  rate  is  lower  in  that  group.  The  av¬ 
erage  anxiety  levels  of  the  two  groups  are  compared  at  the  conclusion  of 
the  study,  and  the  results  suggest  that  the  participants  in  the  medication 
group  are  less  anxious  than  those  in  the  placebo  group.  The  results  seem 
to  support  the  conclusion  that  the  medication  was  effective  for  the  treat¬ 
ment  of  anxiety.  The  problem  with  this  conclusion  is  that  the  results  are 
potentially  confounded  by  attrition.  If  no  study  participants  had  dropped 
out  of  the  medication  group,  it  is  likely  that  the  results  would  have  been 
different.  In  this  example,  notice  that  attrition  was  still  a  factor  after  ran¬ 
domization  and  that  the  final  sample  was  probably  very  different  from  the 
original  sample  used  to  form  the  experimental  and  control  groups. 
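
A minimal simulation sketch can make this confound concrete. The numbers below are assumptions chosen only for illustration: the drug is given no true effect, the dropout rule (medication participants with anxiety scores above 36 leave the study 60% of the time) is hypothetical, and, as a simplification of the example above, no one leaves the placebo group.

```python
import random
from statistics import mean


def simulate_attrition(n_per_group=5000, seed=3):
    """Compare group means when only the medication group loses participants."""
    rng = random.Random(seed)
    # End-of-study anxiety is drawn identically in both groups: no drug effect.
    placebo = [rng.gauss(30, 8) for _ in range(n_per_group)]
    medication = [rng.gauss(30, 8) for _ in range(n_per_group)]
    # Hypothetical dropout rule: medication participants whose anxiety is
    # above 36 leave the study 60% of the time.
    completers = [a for a in medication if not (a > 36 and rng.random() < 0.6)]
    return mean(placebo), mean(medication), mean(completers), len(completers)


placebo_mean, assigned_mean, completer_mean, n_completers = simulate_attrition()
print(f"placebo group mean:                {placebo_mean:.1f}")
print(f"medication group, all assigned:    {assigned_mean:.1f}")
print(f"medication group, completers only: {completer_mean:.1f} (n={n_completers})")
```

When everyone assigned to the medication group is counted, the two group means are nearly identical. When only the completers are counted, the medication group appears less anxious, even though the drug did nothing in this sketch.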

Diffusion  or  Imitation  of  Treatment 

This  threat  to  internal  validity  is  common  in  various  forms  of  medical  and 
psychotherapy  treatment  effectiveness  research,  and  it  manifests  itself  in 
two  distinct  but  related  sets  of  circumstances. 

The  first  set  of  circumstances  is  the  unintended  exposure  of  a  control 
group  to  the  actual  or  similar  intervention  (independent  variable)  in¬ 
tended  only  for  the  experimental  condition  (Kazdin,  2003c;  Pedhazur  & 
Schmelkin,  1991).  Let’s  consider  a  study  examining  the  relative  benefits  of 
exercise  and  nutritional  counseling  on  weight  loss.  The  researchers  hy¬ 
pothesize  that  exercise  is  more  effective  than  nutritional  counseling  and 
assign  participants  to  an  exercise,  nutritional  counseling,  or  no¬ 
intervention  control  group.  The  experimental  group  receives  a  cus¬ 
tomized  exercise  regimen,  the  nutritional  group  receives  general  nutri¬ 
tional  counseling,  and  the  control  group  is  simply  monitored  for  weight 
loss  or  gain  for  the  same  time  period. 

During  the  course  of  the  study,  a  well-intentioned,  but  misguided,  nu¬ 
tritional  counselor  extols  the  benefits  of  exercise  to  the  members  of  the 
nutritional  counseling  group.  This  additional  counseling  was  not  part  of
the  original  design  and  the  researchers  are  unaware  that  it  is  taking  place. 
Although  the  nutritional  counseling  group  is  not  receiving  the  actual  ex¬ 
ercise  intervention,  the  discussion  of  exercise  with  this  group  might  have 
an  unintended  and  uncontrolled-for  effect.  For  example,  this  knowledge 
might  encourage  participants  in  the  nutritional  group  to  seek  out  their 
own  exercise  program  or  to  change  their  day-to-day  habits  in  such  a  way 
that  increases  their  general  activity  level,  such  as  taking  the  stairs  instead 
of  the  elevator.  If  that  is  indeed  the  case,  then  the  nutritional  group  has
received  an  intervention  similar  to  that  of  the  experimental  group.  At  a  minimum,  the
results  could  be  confounded  because  the  nutritional  condition  is  not  be¬ 
ing  delivered  as  the  researchers  had  originally  intended,  because  the  exer¬ 
cise  condition  has  diffused  into  the  nutritional  group.  The  threat  to  inter¬ 
nal  validity  in  this  example  lies  in  the  possibility  that  the  exercise  and 
nutritional  groups  have  now  received  similar  interventions,  which  might 
equalize  performance  across  the  groups  (Kazdin,  2003c). 

The  second  set  of  circumstances  arises  when  the  experimental  group 
does  not  receive  the  intended  intervention  at  all  (Kazdin,  2003c;  Pedhazur 
&  Schmelkin,  1991).  In  the  first  case,  participants  in  a  control  group  either 
gain  knowledge  about  or  are  unintentionally  exposed  to  the  experimental 
intervention  (the  independent  variable).  In  this  case,  the  researcher  be¬ 
lieves  that  the  experimental  group  has  received  the  intervention  when,  in 
reality, it has not. This is a common threat in many forms of psychotherapy
research. Take, for example, a study comparing the effectiveness of
behavioral and psychodynamic therapies for depression. Two therapists are
recruited and trained to deliver the interventions. Both therapists are
psychodynamic in their orientation, so one receives supplemental training in
behavioral techniques. Participants receive one of the two treatments and
the results suggest that they are both equally effective. What the
researchers do not know is that the behavioral therapist has either
intentionally or unintentionally strayed from the specified protocol at
times and included elements of the psychodynamic treatment in the behavioral
condition. In other words, the behavioral group might not have received a
behavioral intervention at all. At best, they have received a hybrid of
psychodynamic and behavioral treatment. As in our previous example, rather
than comparing two distinct conditions, the researchers might be comparing
two conditions that are more similar than intended by the original research
design. Again, this might equalize the performance of the experimental and
control groups, which could have the effect of distorting or clouding the
results of the study.

Diffusion or Imitation of Treatment

Diffusion or imitation of treatment is a threat to internal validity because
it can equalize the performance of experimental and control groups.

Special  Treatment  or  Reactions  of  Controls 

These  relatively  common  threats  to  internal  validity  may  be  caused  by  the 
special,  often  compensatory,  treatment  or  attention  given  to  the  control 
group.  Even  in  the  absence  of  special  attention  or  treatment,  controls  may 
realize  that  they  are  in  a  “lesser”  condition  and  react  by  competing  or  oth¬ 
erwise  improving  their  performance.  Either  of  these  situations  can  equal¬ 
ize  the  performance  of  the  experimental  and  control  conditions  and 
thereby  “wash  out”  between-group  differences  on  the  dependent  variable
(Christensen,  1988;  Kazdin,  2003c;  Pedhazur  &  Schmelkin,  1991).  Special 
treatment  itself  is  a  relatively  common  threat  to  internal  validity  and  can 
be  related  to  any  number  of  activities  conducted  with  the  control  (nonin¬ 
tervention)  group.  Remember  that  in  this  case,  the  intervention  is  also  the 
independent  variable.  These  factors  range  from  simple  human  interaction 
to  more  concrete  examples  such  as  financial  compensation  or  special  priv¬ 
ileges.  For  example,  attention  alone  might  produce  an  unintended  change 
in  behavior. 

Let’s  assume  that  there  are  two  groups  in  a  study  of  depression.  The  in¬ 
tervention  or  experimental  group  receives  therapy  while  the  control  group 
is  simply  monitored  weekly  for  symptom  severity.  The  monitoring  con¬ 
sists  of  an  hour-long  structured  interview  with  a  research  assistant.  This 
weekly  social  attention  might  act  as  an  intervention  despite  the  fact  that  it 
was  intended  for  monitoring  purposes  only.  Perhaps  the  interview  gives
the  control  participants  the  opportunity  to  discuss  their  symptoms,  which 
produces  some  symptom  relief  even  without  therapy  per  se.  After  all,  so¬ 
cial  support  has  been  linked  to  positive  outcomes  for  depression.  The 
same  effect  might  be  observed  even  in  the  absence  of  human  contact.  For 
example,  just  filling  out  a  self-report  measure  of  depressive  symptoms  in 
an  empty  room  might  have  the  same  effect  by  raising  the  awareness  of  the 
control  participants  in  regard  to  their  current  symptom  level.  Reinforcers 
and  other  incentives  might  have  a  similar  effect.  Giving  the  control  par¬ 
ticipants  money  or  special  privileges  might  have  an  impact  on  levels  of  de¬ 
pression  by  raising  self-esteem  or  reducing  hopelessness.  Like  diffusion  or 
imitation  of  treatment,  this  threat  to  internal  validity  might  equalize  the 
performance  of  the  experimental  and  control  groups,  which  could  have 
the  effect  of  distorting  or  clouding  the  results  of  the  study. 

In  conclusion,  threats  to  the  internal  validity  of  a  study  (summarized  in 
Rapid  Reference  6.1)  are  common  and,  at  times,  unavoidable.  They  can  oc¬ 
cur  alone  or  in  combination,  and  they  can  create  unwanted  plausible  alter¬ 
native  hypotheses  for  the  results  of  a  study.  These  rival  hypotheses  may 
make  it  difficult  to  determine  causation.  Some  of  these  threats  can  be  han¬ 
dled  effectively  through  design  components  (e.g.,  control  groups  and  ran¬ 
domization)  at  the  outset  of  the  study,  while  others  (e.g.,  attrition)  take 
place  during  the  course  of  the  study.  Accounting  for  these  threats  is  a  crit¬ 
ical  aspect  and  function  of  research  methodology  that  should  take  place, 
if  possible,  at  the  design  stage  of  the  study.  Refer  to  Chapter  3  for  a  gen¬ 
eral  discussion  of  these  strategies. 

EXTERNAL  VALIDITY 

External  validity  is  concerned  with  the  generalizability  of  the  results  of  a  re¬ 
search  study.  In  all  forms  of  research  design,  the  results  and  conclusions  of 
the  study  are  limited  to  the  participants  and  conditions  as  defined  by  the 
contours  of  the  study.  External  validity  (compare  to  ecological  validity  in  Rapid 
Reference  6.2)  refers  to  the  degree  to  which  research  results  generalize  to 
other  conditions,  participants,  times,  and  places  (Graziano  &  Raulin,  2004). 



Rapid  Reference  6.1


Threats  to  Internal  Validity 

•  History:  Global  internal  or  external  events  or  incidents  that  take 
place  during  the  course  of  the  study  that  might  have  unintended  and 
uncontrolled-for  impacts  on  the  study’s  final  outcome  (i.e.,  on  the
dependent  variable).

•  Maturation:  Intrinsic  changes  within  the  participants  that  are  usually 
related  to  the  passage  of  time. 

•  Instrumentation:  Changes  in  the  assessment  of  the  independent 
variable  that  are  usually  related  to  changes  in  the  measuring  instrument 
or  measurement  procedures  over  time.

•  Testing:  The  effects  that  taking  a  test  on  one  occasion  may  have  on
subsequent  administrations  of  the  test.  It  is  most  often  encountered  in
longitudinal  research,  in  which  participants  are  repeatedly  measured  on
the  same  variables  of  interest  over  time.

•  Statistical  regression:  Statistical  phenomenon,  prevalent  in  pretest 
and  posttest  designs,  in  which  extremely  high  or  low  scores  on  a  mea¬ 
sure  tend  to  revert  toward  the  mean  of  the  distribution  with  repeated 
testing. 

•  Selection  bias:  Systematic  differences  in  the  assignment  of  partici¬ 
pants  to  experimental  conditions. 

•  Attrition:  Loss  of  research  participants  that  may  alter  the  original 
composition  of  groups  and  compromise  the  validity  of  the  study. 

•  Diffusion  or  imitation  of  treatment:  Unintended  exposure  of  a 
control  group  to  an  intervention  intended  only  for  the  experimental 
group,  or  a  failure  to  expose  the  experimental  group  to  the  intended 
intervention.  This  confound  most  commonly  occurs  in  medical  and
psychological  intervention  studies.

•  Special  treatment  or  reactions  of  controls:  Relatively  common 
threats  to  internal  validity  in  which  either  (1)  special  or  compensatory
treatment  or  attention  is  given  to  the  control  condition,  or  (2)  partici¬ 
pants  in  the  control  condition,  as  a  result  of  their  assignment,  react  or 
compensate  in  a  manner  that  improves  or  otherwise  alters  their  per¬ 
formance. 

Rapid  Reference  6.2

Ecological  and  Temporal  Validity

Although  the  terms  “ecological  validity”  and  “external  validity”  are  some¬ 
times  used  interchangeably,  a  clear  distinction  can  be  drawn  between  the 
two.  Of  the  two,  external  validity  is  a  more  general  concept.  It  refers  to  the 
degree  to  which  research  results  generalize  to  other  conditions,  partici¬ 
pants,  times,  and  places,  and  it  is  ultimately  concerned  with  the  conclu¬ 
sions  that  can  be  drawn  about  the  strength  of  the  inferred  causal  relation¬ 
ship  between  the  independent  and  dependent  variables  to  circumstances 
beyond  those  experimentally  studied.  Ecological  validity  is  a  more  specific 
concept  that  refers  to  the  generalization  of  findings  obtained  in  a  labora¬ 
tory  setting  to  the  real  world. 

Temporal  validity  is  another  term  that  is  related  broadly  to  external  validity.
It  refers  to  the  extent  to  which  the  results  of  a  study  can  be  generalized 
across  time.  More  specifically,  this  type  of  validity  refers  to  the  effects  of 
seasonal,  cyclical,  and  person-specific  fluctuations  that  can  affect  the
generalizability  of  the  study's  findings.


Therefore,  a  study  has  more  external  validity  when  the  results  generalize 
beyond  the  study  sample  to  other  populations,  settings,  and  circumstances. 

External  validity  refers  to  conclu¬ 
sions  that  can  be  drawn  about  the 
strength  of  the  inferred  causal  re¬ 
lationship  between  the  indepen¬ 
dent  and  dependent  variables  to 
circumstances  beyond  those  ex¬ 
perimentally  studied.  In  other 
words,  would  the  results  of  our 
study  apply  to  different  popula¬ 
tions,  settings,  or  sets  of  circum¬ 
stances?  If  so,  then  the  study  has 
strong  external  validity. 

External Validity

External validity is the degree to which research results generalize to
other conditions, participants, times, and places. External validity is
related to conclusions that can be drawn about the strength of the inferred
causal relationship between the independent and dependent variables to
circumstances beyond those experimentally studied.

For example, let’s consider a study designed to determine the effectiveness
of a new intervention for test anxiety. Again, the intervention
is  the  independent  variable,  while  test  anxiety  is  the  dependent  variable. 
The  study  is  being  conducted  at  a  major  East  Coast  university,  and  the  par¬ 
ticipants  are  college  freshmen  currently  taking  an  introductory-level  psy¬ 
chology  class.  Although  this  might  not  seem  realistic  at  first  glance,  many 
studies  are  conducted  with  college  students  because  they  are  easily  acces¬ 
sible  and  form  samples  of  convenience  (Kazdin,  2003c).  Students  are  as¬ 
sessed  to  determine  their  levels  of  test  anxiety  and  then  are  assigned  to  ei¬ 
ther  a  no-treatment  control  group  or  an  experimental  group  that  receives 
the  intervention.  The  new  therapy  is  remarkably  effective  and  significantly 
reduces  test  anxiety  in  the  experimental  group.  The  researchers  immedi¬ 
ately  market  their  intervention  as  being  a  generally  effective  treatment  for 
test  anxiety.  Can  the  researchers  support  their  claim  based  on  the  results  of 
their  study?  Hopefully,  you  have  already  realized  that  this  study  has  serious 
flaws  related  to  internal  validity,  but  let’s  put  that  aside  for  the  purposes  of 
this  example  and  focus  only  on  issues  surrounding  external  validity. 

Remember  that  external  validity  is  the  degree  to  which  research  results 
generalize  to  other  conditions,  participants,  times,  and  places.  A  study  has 
external  validity  when  the  results  generalize  to  other  populations,  settings, 
and  circumstances.  In  our  example,  the  researchers  have  found  that  their 
intervention  effectively  reduces  test  anxiety,  and  they  are  assuming  that  it 
is  effective  across  a  wide  variety  of  settings  and  populations.  They  might 
be  correct,  but  the  design  of  this  study  does  not  have  strong  external  va¬ 
lidity  for  a  number  of  reasons,  which  undermines  the  assertion  that  the  in¬ 
tervention  is  effective  for  other  populations. 

First,  the  study  was  conducted  with  a  sample  of  college  freshmen  en¬ 
rolled  in  an  introductory-level  psychology  course.  This  is  a  very  narrow 
sample;  would  the  results  apply  to  broader  populations,  such  as  elemen¬ 
tary  school  children,  high  school  students,  or  college  seniors?  Would  the 
results  apply  to  college  freshmen  who  were  not  enrolled  in  an  introductory- 
level  psychology  class?  We  do  not  know  for  certain  because  these  individ¬ 
uals  were  not  included  in  the  sample  used  in  the  study. 

Second,  do  the  results  apply  to  other  settings,  such  as  different  univer¬ 
sities,  high  schools,  classes,  and  business  environments?  The  effectiveness 
of  the  intervention  might  be  limited  to  the  setting  where  the  study  was
conducted.  For  example,  we  might  find  that  the  results  do  not  generalize 
to  universities  on  the  West  Coast  or  to  high  schools.  In  other  words,  the 
effectiveness  of  the  intervention  might  be  specific  to  the  population  rep¬ 
resented  by  the  sample  used  in  the  study. 

Third,  is  there  something  unique  about  the  conditions  of  the  study?  For 
example,  was  the  study  conducted  around  midterm  or  final  exams,  when 
anxiety  levels  might  be  unusually  high?  Would  the  intervention  have  been 
as  effective  if  the  study  had  occurred  at  a  different  time  during  the  semes¬ 
ter?  As  mentioned  previously,  the  answer  is  that  we  do  not  know  for  sure. 
In  terms  of  external  validity,  the  most  accurate  statement  that  can  be  made 
from  the  results  of  our  hypothetical  study  is  that  the  intervention  was  ef¬ 
fective  for  college  freshmen  in  introductory-level  psychology  classes  at  a 
major  East  Coast  university.  Any  other  conclusions  would  not  necessarily 
be  supported,  and  additional  research  across  different  times,  places,  and 
conditions  would  be  necessary  to  support  any  other  conclusions. 


Threats  to  External  Validity 

As  with  internal  validity,  there  are  confounds  and  characteristics  of  a  study 
that  can  limit  the  generalizability  of  the  results.  These  characteristics  and 
confounds  are  collectively  referred  to  as  threats  to  external  validity,  and  they 
include  sample  characteristics,  stimulus  characteristics  and  settings,  reac¬ 
tivity  of  experimental  arrangements,  multiple-treatment  interference, 
novelty  effects,  reactivity  of  assessment,  test  sensitization,  and  timing  of 
measurement  (Kazdin,  2003c).  Controlling  these  influences  allows  the  re¬ 
searchers  to  more  confidently  generalize  the  results  of  the  study  to  other 
circumstances  and  populations  (Kazdin;  Rosnow  &  Rosenthal,  2002). 

Sample  Characteristics 

This  threat  to  external  validity  refers  to  a  phenomenon  whereby  the  results 
of  a  study  apply  only  to  a  particular  sample.  Accordingly,  it  is  unclear  whether 
the  results  can  be  applied  to  other  samples  that  vary  on  characteristics  such 
as  age,  gender,  education,  and  socioeconomic  status  (Kazdin,  2003c). 

An  example  of  sample  characteristics  can  be  found  in  our  earlier  dis¬
cussion  about  external  validity.  In  that  example,  we  noted  that  the  sample 
consisted  of  college  freshmen  enrolled  in  an  introductory-level  psychol¬ 
ogy  class.  As  we  noted,  we  cannot  assume  that  the  findings  of  that  study 
would  necessarily  hold  true  for  a  different  sample,  such  as  high  school  stu¬ 
dents  or  elementary  school  children.  In  addition,  we  cannot  even  assume 
that  the  findings  would  hold  true  for  college  freshmen  generally.  Through 
further  research,  we  might  discover  that  the  intervention  was  effective
only  for  psychology  students  and  did  not  generalize  to  freshmen  taking 
introductory-level  business  or  science  classes.  In  other  words,  even  this 
subtle  difference  in  sample  characteristics  can  have  a  significant  effect  on 
the  generalizability  of  a  study’s  results.  Clearly,  it  would  not  be  possible  or 
practical  to  include  every  possible  population  characteristic  in  our  sample, 
so  we  are  always  faced  with  the  possibility  that  sample  characteristics  are  a 
confound  to  the  external  validity  of  any  study.  Accordingly,  conclusions 


DON'T  FORGET 


Diversity  Characteristics 

Sample  characteristics  can  encompass  a  wide  variety  of  traits  and  demo¬ 
graphic  characteristics,  with  some  of  the  most  common  being  age,  gender; 
education,  and  socioeconomic  status.  Commentators  have  noted  that 
some  diversity-related  characteristics  are  not  well  represented  in  most 
forms  of  research  (Kazdin,  2003c). The  primary  concern  in  this  area  is  that 
there  is  an  overrepresentation  of  some  groups,  such  as  college  students; 
and  a  related,  limited  inclusion  of  underrepresented  and  minority  groups, 
such  as  Hispanic  Americans  and  women.  Diversity  characteristics  are  an 
important  issue  in  terms  of  external  validity,  and  they  can  have  important 
and  far-reaching  consequences  for  all  strata  of  society.  For  example,  the 
results  of  a  medication  effectiveness  study  conducted  only  on  White 
males might not hold true for a different racial group. The possible ramifications
should be obvious. Similarly, a study designed to provide information
needed to make an important public policy decision should include a
sample  diverse  enough  to  accurately  capture  the  particular  group  that  will 
be  directly  impacted  by  the  decision.  Although  these  are  only  two  ex¬ 
amples,  diversity  factors  should  be  considered  in  all  types  of  research. 
drawn from the results of a study tend to be limited to the characteristics
represented  by  the  sample  used  in  the  study. 

Stimulus  Characteristics  and  Settings 

This  threat  to  external  validity  refers  to  an  environmental  phenomenon  in 
which  particular  features  or  conditions  of  the  study  limit  the  generaliz- 
ability  of  the  findings  (Brunswik,  1955;  Pedhazur  &  Schmelkin,  1991). 
Every  study  operates  under  a  unique  set  of  conditions  and  circumstances 
related  to  the  experimental  arrangement.  The  most  commonly  cited  ex¬ 
amples  include  the  research  setting  and  the  researchers  involved  in  the 
study.  The  major  concern  with  this  threat  to  external  validity  is  that  the 
findings  from  one  study  are  influenced  by  a  set  of  unique  conditions,  and 
thus  may  not  necessarily  generalize  to  another  study,  even  if  the  other 
study  uses  a  similar  sample. 

Let’s  return  again  to  our  previous  example  involving  the  intervention 
for  test  anxiety.  That  study  found  that  the  intervention  was  effective  for 
test  anxiety  with  college  freshmen  enrolled  in  an  introductory-level  psy¬ 
chology  class  at  a  major  East  Coast  university.  A  colleague  at  a  West  Coast 
university  decides  to  replicate  the  study  using  a  sample  of  college  fresh¬ 
men  enrolled  in  an  introductory-level  psychology  class.  Despite  following 
our  East  Coast  procedures  to  the  letter,  our  colleague  does  not  find  that 
the  intervention  was  effective.  Although  there  could  be  a  number  of 
explanations  for  this,  it  is  possible  that  a  stimulus-characteristics-and- 
settings  confound  is  present.  The  setting  where  the  intervention  is  deliv¬ 
ered  is  no  doubt  different  at  our  West  Coast  colleague’s  university — for 
example,  it  could  be  less  comfortable  than  our  East  Coast  setting.  Simi¬ 
larly,  a  different  individual  is  delivering  the  intervention  to  the  college 
freshmen  on  the  West  Coast,  and  this  individual  might  be  less  competent 
or  less  approachable  than  his  or  her  East  Coast  counterpart.  Each  of  these 
is a potential source of a stimulus-characteristics-and-settings confound.

Reactivity  of  the  Experimental  Arrangements 

This  threat  to  external  validity  refers  to  a  potentially  confounding  variable 
that is a result of the influence produced by knowing that one is participating
in a research study (Christensen, 1988). In other words, the participants’
awareness that they are taking part in a study can have an impact on
their  attitudes  and  behavior  during  the  course  of  the  study.  This,  in  turn, 
can  have  a  significant  impact  on  any  results  obtained  from  the  study  and  is 
especially  problematic  when  participants  know  the  purpose  or  hypotheses 
of  the  study.  We  discussed  strategies  for  limiting  participants’  knowledge 
about  a  study’s  hypotheses  in  Chapter  3.  As  a  threat  to  external  validity,  the 
issue  becomes  whether  the  same  results  would  have  been  obtained  had  the 
participants  been  unaware  that  they  were  being  studied  (Kazdin,  2003c). 
This  threat  to  external  validity  is  a  very  common  one.  The  primary  reason 
for  this  is  that  ethical  standards  require  that  participants  provide  informed 
consent  before  participating  in  most  research  studies. 

For  example,  let’s  consider  a  study  designed  to  evaluate  the  effective¬ 
ness  of  a  10-week  behavior  modification  program  devised  to  reduce  re¬ 
cidivism  in  adolescent  offenders.  The  experimental  group  receives  the 
intervention  (i.e.,  the  independent  variable)  and  the  control  group  does  not. 
The  researchers  find  that  the  experimental  group  shows  lower  levels  of 
recidivism  (i.e.,  the  dependent  variable)  when  compared  to  the  control 
group.  The  researchers  might  be  tempted  to  say  that  the  intervention  was 
responsible  for  the  findings;  however,  it  might  be  that  the  behavior  in 
question  improved  because  the  participants  had  assumed  a  compliant  at¬ 
titude  toward  the  intervention.  Alternatively,  if  the  participants  in  the 
treatment  group  had  adopted  a  more  negativistic  attitude  toward  the  inter¬ 
vention,  the  results  of  the  study  might  have  suggested  that  the  interven¬ 
tion  was  not  successful.  In  any  event,  either  outcome  might  be  the  result 
of  reactivity  to  the  experimental  arrangements  and  not  the  interven¬ 
tion  itself. 

Multiple-Treatment  Interference 

This  threat  to  external  validity  refers  to  research  situations  in  which  (1) 
participants  are  administered  more  than  one  experimental  intervention 
(or  independent  variable)  within  the  same  study  or  (2)  the  same  individu¬ 
als  participate  in  more  than  one  study  (Pedhazur  &  Schmelkin,  1991).  Al¬ 
though  it  is  most  common  in  treatment-outcome  studies,  it  is  also  preva¬ 
lent  in  any  study  that  has  more  than  one  experimental  condition  or 
independent variable. The major implication of this threat is that the research
results may be due to the context or series of conditions in which
the research was presented (Kazdin, 2003c).

In  the  first  research  situation,  independent  variables  administered  si¬ 
multaneously  or  sequentially  may  produce  an  interaction  effect.  In  gen¬ 
eral,  multiple  independent  variables  administered  in  the  same  study  act  as 
a  confound  that  makes  it  difficult  to  determine  which  one  is  responsible 
for  the  observed  results.  The  second  situation  refers  to  the  relative  expe¬ 
rience  and  sophistication  of  the  participants.  Familiarity  with  research  can 
affect  the  behavior  and  responses  of  participants,  which  again  makes  it  dif¬ 
ficult  to  accurately  interpret  the  results  of  the  study. 

For  example,  let’s  consider  a  common  situation  in  which  multiple- 
treatment  interference  can  occur.  A  12-week  treatment  study  is  designed 
to  assess  the  effectiveness  of  a  combined  approach  to  treating  depression 
that  encompasses  elements  of  both  psychodynamic  and  cognitive  therapy. 
The  participants  are  randomly  divided  into  a  control  group  and  an  experi¬ 
mental  group.  Both  groups  are  assessed  to  determine  symptom  severity. 
The  experimental  group  then  receives  6  weeks  of  psychodynamic  therapy 
followed  by  6  weeks  of  cognitive  therapy.  At  the  end  of  12  weeks,  both  the 
control  and  experimental  groups  are  reassessed  for  symptom  severity.  The 
results  of  the  assessment  suggest  that  the  experimental  group  experienced 
significant symptom reduction while the control group did not. The researchers
conclude that a combined psychodynamic-cognitive therapy
model is an effective approach to treating depression.

Although  this  may  indeed  be  the  case,  it  is  far  from  a  certainty  and  there 
are  many  unanswered  questions.  For  example,  would  the  treatment  have 
been  as  effective  if  the  cognitive  therapy  had  been  administered  first? 
Would  6  weeks  of  psychodynamic  or  cognitive  therapy  alone  have  pro¬ 
duced  similar  results?  Did  the  presence  of  both  treatment  modalities  ac¬ 
tually  reduce  the  effectiveness  of  the  overall  intervention?  Although  the 
study  produced  significant  symptom  improvements,  it  might  have  pro¬ 
duced  even  better  results  if  both  forms  of  therapy  had  not  been  used. 
These  are  aspects  of  multiple-treatment  effects  that  are  best  controlled  for 
through  specific  research  designs  that  were  discussed  in  Chapter  5. 
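
One design-based way to address the order questions raised above is counterbalancing, in which different participants receive the treatments in different orders. Whether this is among the specific designs the authors have in mind is an assumption. The sketch below is illustrative only, and the function name, participant IDs, and seed are hypothetical.

```python
import random
from itertools import permutations


def counterbalanced_assignment(participant_ids, treatments, seed=0):
    """Give each participant one treatment order, using every order about equally often."""
    orders = list(permutations(treatments))  # e.g., (A, B) and (B, A)
    rng = random.Random(seed)                # fixed seed so the allocation is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)                    # randomize who receives which order
    # Cycle through the possible orders so group sizes differ by at most one.
    return {pid: orders[i % len(orders)] for i, pid in enumerate(shuffled)}


# Hypothetical sample of eight participants and the two therapies from the example.
assignment = counterbalanced_assignment(range(1, 9), ["psychodynamic", "cognitive"])
for pid in sorted(assignment):
    print(pid, " then ".join(assignment[pid]))
```

Comparing outcomes between the two orders can indicate whether sequence matters. It does not show whether either therapy alone would have been sufficient, which would require additional single-treatment conditions.
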
Novelty Effects

This threat to external validity refers to the possibility that the effects
of the independent variable may be due in part to the uniqueness or novelty
of the stimulus or situation and not to the intervention itself. It is similar
to the Hawthorne effect (discussed in Chapter 3; see also Rapid Reference 6.3)
in that new or unusual treatments or experimental interventions might produce
results that disappear once the novelty of the situation or condition wears
off. In other words, the novelty of the intervention or situation acts as a
confounding variable, and it is that novelty (and not the independent variable)
that is the real explanation for the results. This threat to external validity
is common across a wide variety of settings and experimental designs.

Take,  for  example,  a  situation  in  which  researchers  are  trying  to  deter¬ 
mine  the  effectiveness  of  a  new  therapy  intervention  for  individuals  with 
a  history  of  chronic  depression.  They  have  decided  to  call  this  new  inter¬ 
vention  “smile  therapy”  because  the  therapist  is  trained  to  smile  at  the 
client  on  a  regular  schedule  in  the  hope  of  encouraging  a  positive  mood 
and  outlook  on  life.  Symptoms  of  depression  are  assessed,  and  then  the 
participants  are  randomly  assigned  to  either  a  control  group  or  one  of 
three  experimental  conditions.  The  three  experimental  conditions  include 
smile  therapy,  cognitive-behavioral  therapy,  and  interpersonal  therapy.  All 
of  the  participants  undergo  their  respective  treatments  for  4  weeks  and  are 
then  reassessed  for  severity  of  depression.  The  researchers  find  that  smile 
therapy  is  more  effective  than  both  cognitive-behavioral  and  interpersonal 
therapy  on  symptoms  of  chronic  depression. 

By  now,  you  have  likely  figured  out  that  there  might  be  a  problem  here 
The Hawthorne Effect

Reactivity of the experimental arrangements is also referred to as the
Hawthorne effect, which occurs when an individual’s performance in a study
is affected by the individual’s knowledge that he or she is participating in
a study. For example, some participants might be more attentive, compliant,
or diligent, while others might be intentionally difficult or noncooperative
despite having volunteered for the study (Bracht & Glass, 1968).
because a novelty effect could also account for the results. Our population
in  this  fictitious  study  consists  of  individuals  with  chronic  depression,  so 
it  is  likely  that  they  have  tried  many  treatment  modalities  or  at  least  been 
in  treatment  in  one  modality  for  a  significant  period  of  time.  Although 
these  modalities  are  somewhat  distinct,  none  of  them  involves  the  thera¬ 
pist  smiling  at  the  participant  as  the  intervention.  The  smile  therapy  is 
therefore  unique,  or  novel,  and  this  alone  might  account  for  the  improve¬ 
ments  in  depression.  The  other  issue  here  is  that  the  intervention  took 
place over the course of 4 weeks. If these findings were the result of a novelty
effect, then we would expect the treatment effect to disappear over time as
the  novelty  of  the  smile  therapy  diminished.  Four  weeks  might  not  be  a 
sufficient  amount  of  time  for  the  novelty  to  diminish,  and  the  results  of 
the  study  at  12  weeks  might  not  have  demonstrated  a  significant  finding 
for  this  new  form  of  therapy.  The  presence  of  a  novelty  effect  would  limit 
the  researcher’s  ability  to  generalize  the  results  of  this  study  to  situations 
or contexts in which the same effect does not exist.

This  effect  can  also  be  seen  outside  the  treatment-intervention  arena. 
Suppose  you  wanted  to  determine  the  effectiveness  of  an  intervention  de¬ 
signed  to  increase  teamwork  and  related  productivity  for  top-level  man¬ 
agers  in  two  distinct  organizational  settings.  Putting  aside  the  obvious 
threats  to  internal  validity  created  by  conducting  your  study  without  ran¬ 
domization  in  two  separate  environments,  let’s  further  explore  the  impli¬ 
cations  of  the  novelty  effect.  The  researchers  identify  the  top  managers  in 
both  organizations  and  administer  the  intervention.  One  organization  is  a 
manufacturing  company  and  the  other  is  a  large  financial  management 
firm.  The  researchers  find  that  the  intervention  increases  productivity  and 
teamwork,  but  only  in  the  financial  management  firm.  The  researchers 
therefore  conclude  that  the  intervention  is  effective,  but  only  in  the  one 
environment. 

It  is  also  possible,  however,  that  the  finding  is  due  to  a  novelty  effect  and 
not  to  the  intervention  itself.  Let’s  add  some  additional  relevant  informa¬ 
tion.  What  if  you  knew  that  the  manufacturing  company  was  engaged  in  a 
total  quality  improvement  program?  These  programs  tend  to  involve  a 
high level of teamwork and group interaction on a daily basis. You also discover
that the financial management firm has never addressed the issue of
teamwork  or  group  productivity  in  the  past.  Therefore,  the  significant 
finding  might  be  due  to  the  novelty  of  introducing  teamwork  into  a  setting 
where  it  had  never  previously  been  considered,  and  not  to  the  teamwork 
intervention  itself.  Conversely,  the  intervention  might  not  have  been  ef¬ 
fective  in  the  manufacturing  company  because  the  organization  had  al¬ 
ready  incorporated  the  model  into  their  corporate  culture.  What  if  we  tried 
the  intervention  in  a  financial  management  firm  that  had  already  imple¬ 
mented  a  team  approach?  Again,  we  might  find  that  the  intervention  is  not 
effective.  If  that  were  indeed  the  case,  then  in  terms  of  generalizability, 
the  more  accurate  statement  might  be  that  the  intervention  is  effective  in 
financial  management  companies  that  have  never  been  exposed  to  team¬ 
building  interventions. 

Reactivity  of  Assessment 

This  threat  to  external  validity  refers  to  a  phenomenon  whereby  partici¬ 
pants’  awareness  that  their  performance  is  being  measured  can  alter  their 
performance  from  what  it  would  otherwise  have  been  (Christensen,  1988; 
Kazdin,  2003c).  Reactivity  is  a  threat  to  external  validity  when  this  aware¬ 
ness  leads  study  participants  to  respond  differently  than  they  normally 
would  in  the  face  of  experimental  conditions. 

Reactivity  is  another  common  threat  to  external  validity  that  can  occur 
across  a  wide  variety  of  environments  and  circumstances,  and  it  is  a  sub¬ 
stantial  threat  whenever  formal  or  informal  assessment  is  a  necessary 
component  of  the  study.  For  example,  consider  a  psychotherapy  outcome 
study  where  participants  are  assessed  for  number  and  severity  of  symp¬ 
toms  of  emotional  distress.  The  very  fact  that  an  assessment  is  taking  place 
might  cause  the  participants  to  distort  their  responses  for  a  variety  of 
reasons.  For  example,  participants  might  feel  uncomfortable  or  self- 
conscious  and  underreport  their  symptoms.  Conversely,  participants 
might  overreport  their  symptom  levels  if  they  suspect  that  doing  so  might 
lead  to  more  intensive  treatment.  Rapid  Reference  6.4  discusses  the  ob¬ 
trusiveness  of  the  measurement  process  with  regard  to  participant  reac¬ 
tivity. 
Obtrusive vs. Unobtrusive Measurement

As  mentioned  previously,  reactivity  becomes  a  threat  to  external  validity 
when  participants  in  a  study  respond  differently  than  they  normally  would 
in  the  face  of  experimental  conditions.  Although  a  wide  variety  of  stimuli 
can  cause  reactivity,  the  most  common  example  occurs  during  formal 
measurement  or  assessment.  If  participants  are  aware  that  they  are  being 
assessed, then that assessment measure is said to be obtrusive and therefore
likely to affect behavior. Conversely, the term unobtrusive measurement
refers to assessment in which the participants are unaware that the
measurement  is  taking  place  (Rosnow  &  Rosenthal,  2002). 


Although  reactivity  is  common  in  all  forms  of  medical  and  psycholog¬ 
ical  treatment  intervention  studies,  it  is  prevalent  in  other  settings  as  well. 
For  example,  directly  asking  employees  about  their  attitudes  toward  man¬ 
agement  might  lead  to  more  favorable  responses  than  might  otherwise  be 
expected  if  they  filled  out  an  anonymous  questionnaire. 

Pretest and Posttest Sensitization

These  related  threats  to  external  validity  refer  to  the  effects  that  pretesting 
and  posttesting  might  have  on  the  behavior  and  responses  of  the  partici¬ 
pants  in  a  study  (Bracht  &  Glass,  1968;  Lana,  1969;  Pedhazur  & 
Schmelkin,  1991).  In  many  forms  of  research,  participants  are  pretested  to 
quantify  the  presence  of  some  variable  of  interest  and  to  provide  a  base¬ 
line  of  behavior  against  which  the  effects  of  the  experimental  intervention 
(independent  variable)  can  be  evaluated.  For  example,  a  pretest  for  symp¬ 
toms  of  anxiety  would  be  given  to  determine  participant  symptomology  in 
a  treatment  study  investigating  the  effectiveness  of  a  new  therapy  for  anx¬ 
iety  disorders.  The  pretest  information  would  be  used  as  a  baseline  mea¬ 
sure  and  compared  to  a  posttest  measure  of  symptoms  at  the  conclusion 
of  the  study  to  determine  the  intervention’s  effectiveness  at  reducing 
symptoms of anxiety. Generally, pretest sensitization is a possibility whenever
participants are measured prior to the administration of the experimental
intervention and the researchers are interested in measuring the effects
of the independent variable on the dependent variable.

As  a  threat  to  external  validity,  the  concern  is  that  exposure  to  the 
pretest  may  contribute  to,  or  be  the  sole  cause  of,  the  observed  changes  in 
the  dependent  variable.  In  other  words,  would  the  results  of  the  study  have 
been  the  same  if  the  pretest  had  not  been  administered?  This  has  obvious 
implications  for  external  validity  because  pretest  sensitization  might  ren¬ 
der  the  results  irrelevant  in  situations  in  which  the  same  pretest  was  not  ad¬ 
ministered.  For  example,  in  our  previously  mentioned  anxiety  study  the 
same  treatment  effects  might  not  be  found  in  the  absence  of  the  pretest 
for  current  level  of  anxiety. 

Whereas  pretesting  is  focused  on  assessing  the  level  of  a  variable  before 
application  of  the  experimental  intervention  (or  independent  variable), 
posttesting is conducted to assess the effectiveness of the independent variable.
A posttest measurement can have a similar effect on external validity
as  a  pretest  assessment.  Would  the  same  results  have  been  found  if  the 
posttest  had  not  been  administered?  If  not,  then  it  can  be  said  that  posttest 
sensitization  might  account  for  the  results  either  alone  or  in  combination 
with  the  experimental  intervention. 

In  both  pre-  and  postassessment,  the  concern  is  whether  participants 
were sensitized by either measure. If so, the findings might not generalize
to future research and actual interventions conducted
without the same procedures and assessment measures. In other words, the
presence  of  pre-  and  posttesting  becomes  an  integral  part  of  the  interven¬ 
tion  itself.  Therefore,  the  effects  of  the  independent  variable  might  be  less 
prominent  or  even  nonexistent  in  the  absence  of  pretest  or  posttest  sensi¬ 
tization. 
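
One way to make the question "would the results have been the same without the pretest?" concrete is a four-group layout that crosses pretesting with treatment (often called a Solomon four-group design, which is not named in this section). The following is a minimal sketch under that assumption. The function name and all scores are invented for illustration and are not data from any study.

```python
from statistics import mean


def sensitization_check(posttest):
    """posttest maps (pretested, treated) -> list of posttest anxiety scores."""
    # Treatment effect among participants who took the pretest.
    with_pretest = mean(posttest[(True, True)]) - mean(posttest[(True, False)])
    # Treatment effect among participants who were never pretested.
    without_pretest = mean(posttest[(False, True)]) - mean(posttest[(False, False)])
    return {
        "effect_with_pretest": with_pretest,
        "effect_without_pretest": without_pretest,
        # A nonzero difference suggests the pretest changed how the treatment worked.
        "pretest_by_treatment": with_pretest - without_pretest,
    }


# Hypothetical posttest anxiety scores (lower = fewer symptoms).
scores = {
    (True, True): [10, 12, 11, 11],    # pretested, treated
    (True, False): [20, 21, 19, 20],   # pretested, untreated
    (False, True): [16, 17, 15, 16],   # not pretested, treated
    (False, False): [20, 19, 21, 20],  # not pretested, untreated
}
result = sensitization_check(scores)
print(result)
```

In these made-up numbers the treatment lowers anxiety by 9 points when a pretest is given but only by 4 points when it is not, which is the pattern pretest sensitization would produce. A real analysis would also need an inferential test of that difference.
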

Timing  of  Assessment  and  Measurement 

This  threat  to  external  validity  is  particularly  common  in  longitudinal 
forms  of  research,  and  it  refers  to  the  question  of  whether  the  same  results 
would  have  been  obtained  if  measurement  had  occurred  at  a  different 
point  in  time  (Kazdin,  2003c). 

Although this threat to external validity can occur in most types of research
design, it is most common in longitudinal research. (See Chapter 5
for  a  more  detailed  discussion  of  longitudinal  research.)  Longitudinal  re¬ 
search  occurs  over  time  and  is  characterized  by  multiple  assessments  over 
the  duration  of  the  study.  For  example,  a  longitudinal  therapy  outcome 
study  might  find  significant  results  after  assessment  of  symptoms  at  2 
months,  but  not  at  4  or  6  months.  If  the  study  concluded  at  the  end  of  2 
months,  the  researchers  might  come  to  the  general  conclusion  that  the 
treatment  is  effective  for  a  particular  disorder.  This  might  be  an  overgen¬ 
eralization  because  if  the  study  had  continued  for  a  longer  period  of  time, 
the  same  treatment  effect  would  not  have  been  observed.  Thus,  the  more 
appropriate conclusion about our 2-month study might be that the treatment
produces symptom relief for up to 2 months. The more specific
conclusion is supported by the study, while the more general conclusion
about effectiveness might not be accurate due to the timing of
measurement.  Bear  in  mind  that  the  reverse  might  also  be  true:  A  lack  of 
significant  findings  after  measurement  at  2  months  does  not  eliminate  the 
possibility  of  significant  results  if  the  intervention  and  measurement  oc¬ 
curred  over  a  longer  period  of  time. 
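
The dependence of the conclusion on the assessment point can be illustrated with a small sketch. The data, group sizes, and function name below are hypothetical and chosen only to mirror the 2-, 4-, and 6-month example above.

```python
from statistics import mean


def effect_by_timepoint(treatment, control):
    """Control-minus-treatment difference in mean symptom score at each assessment month."""
    return {
        month: mean(control[month]) - mean(treatment[month])
        for month in sorted(treatment)
    }


# Invented symptom scores (higher = more symptoms), keyed by month of assessment.
treatment = {2: [8, 9, 7, 8], 4: [13, 14, 12, 13], 6: [15, 14, 16, 15]}
control = {2: [15, 16, 14, 15], 4: [15, 14, 16, 15], 6: [15, 15, 16, 14]}

effects = effect_by_timepoint(treatment, control)
for month, diff in effects.items():
    print(f"month {month}: treatment advantage = {diff}")
```

A study that stopped at month 2 would report a 7-point advantage, whereas the same hypothetical study assessed at month 6 would report none. The defensible conclusion is therefore tied to the time of measurement.
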

Rapid  Reference  6.5  summarizes  the  threats  to  external  validity  we 
have  discussed  in  this  section,  and  Rapid  Reference  6.6  provides  further 
discussion. 


CONSTRUCT  VALIDITY 

In  the  context  of  research  design  and  methodology,  the  term  construct  va¬ 
lidity  relates  to  interpreting  the  basis  of  the  causal  relationship,  and  it  refers 
to  the  congruence  between  the  study’s  results  and  the  theoretical  under¬ 
pinnings  guiding  the  research  (Kazdin,  2003c).  The  focus  of  construct  va¬ 
lidity  is  usually  on  the  study’s  independent  variable.  In  essence,  construct 
validity  asks  the  question  of  whether  the  theory  supported  by  the  findings 
provides  the  best  available  explanation  of  the  results.  In  other  words,  is  the 
reason  for  the  relationship  between  the  experimental  intervention  (inde¬ 
pendent  variable)  and  the  observed  phenomenon  (dependent  variable) 
due  to  the  underlying  construct  or  explanation  offered  by  the  researchers 
Threats to External Validity

•  Sample characteristics: The extent to which the results of a study
apply  only  to  a  particular  sample. The  key  question  is  whether  the 
study's  results  can  be  applied  to  other  samples  that  vary  on  a  variety  of 
demographic  and  descriptive  characteristics,  such  as  age,  gender,  sexual 
orientation,  education,  and  socioeconomic  status. 

•  Stimulus  characteristics  and  settings:  An  environmental  phe¬ 
nomenon  whereby  particular  features  or  conditions  of  the  study  limit 
the  generalizability  of  the  findings  so  that  the  findings  from  one  study 
do  not  necessarily  apply  to  another  study,  even  if  the  other  study  is  us¬ 
ing  a  similar  sample. 

•  Reactivity  of  experimental  arrangements:  A  potentially  con¬ 
founding  variable  that  results  from  the  influence  produced  by  knowing 
that  one  is  participating  in  a  research  study. 

•  Multiple-treatment interference: This threat refers to research
situations in which (1) participants are administered more than one experimental
intervention within the same study or (2) the same individuals
participate in more than one study.

•  Novelty effects: This refers to the possibility that the effects of the independent
variable may be due in part to the uniqueness or novelty of
the stimulus or situation and not to the intervention itself.

•  Reactivity  of  assessment:  A  phenomenon  whereby  participants’ 
awareness  that  their  performance  is  being  measured  can  alter  their 
performance  from  what  it  otherwise  would  have  been. 

•  Pretest  and  posttest  sensitization:  These  threats  refer  to  the  ef¬ 
fects  that  pretesting  and  posttesting  might  have  on  the  behavior  and  re¬ 
sponses  of  study  participants. 

•  Timing  of  assessment  and  measurement: This  threat  refers  to 
whether  the  same  results  would  have  been  obtained  if  measurement 
had  occurred  at  a  different  point  in  time. 
Importance of Interaction Effects in Relation to
External  Validity 

External  validity  can  best  be  understood  as  an  interaction  between  partic¬ 
ipant  attributes  and  experimental  settings  and  their  related  characteristics. 
Generalization  of  results  from  any  study  is  hampered  when  the  indepen¬ 
dent  variable  interacts  with  participant  attributes  or  characteristics  of  the 
experimental  setting  to  produce  the  observed  results. Therefore,  the 
types  of  threats  to  external  validity  discussed  in  this  chapter  are  far  from 
exhaustive.  Depending  on  the  experimental  design  and  the  research  ques¬ 
tion,  each  study  can  create  unique  threats  to  external  validity  that  should 
be  controlled  for.  If  experimental  control  is  not  possible,  the  limitations  of 
the  study's  findings  should  be  discussed  in  sufficient  detail  to  clarify  the 
relevance and generalizability of the findings.

(Campbell  &  Stanley,  1966;  Cook  &  Campbell,  1979;  Christensen,  1988; 
Graziano  &  Raulin,  2004;  Kazdin,  2003c)? 

There  are  two  primary  methods  for  improving  the  construct  validity  of 
a  study.  First,  strong  construct  validity  is  based  on  clearly  stated  and  accu¬ 
rate  operational  definitions  of  a  study’s  variables.  Second,  the  underlying 
theory  of  the  study  should  have  a  strong  conceptual  basis  and  be  based  on 
well-validated  constructs  (Graziano  &  Raulin,  2004).  Cook  and  Campbell 
(1979)  suggest  several  ways  to  improve  construct  validity;  these  are  listed 
in  Rapid  Reference  6.7. 

Let’s  consider  a  straightforward  example  to  illustrate  the  importance  of 
construct  validity  in  a  study.  A  team  of  researchers  is  interested  in  study¬ 
ing  the  factors  that  contribute  to  mortality  rates  in  a  number  of  different 
countries.  The  scope  of  the  study  prohibits  the  use  of  actual  participants, 
so  the  researchers  decide  to  conduct  a  correlational  study  in  which  they 
analyze  the  statistical  relationships  between  different  countries  and  avail¬ 
able  demographic  data.  The  researchers  hypothesize  that  education  level 
and  family  income  will  be  significantly  related  to  mortality  rate.  The  spe¬ 
cific  hypothesis  is  that  mortality  rate  will  drop  as  education  level  and 
family  income  rise.  In  other  words,  the  researchers  are  hypothesizing  that 
Improving Construct Validity

Cook and Campbell (1979) make the following suggestions for improving
construct validity:

•  Provide  a  clear  operational  definition  of  the  abstract  concept  or  inde¬ 
pendent  variable. 

•  Collect  data  to  demonstrate  that  the  empirical  representation  of  the 
independent  variable  produces  the  expected  outcome. 

•  Collect  data  to  show  that  the  empirical  representation  of  the  indepen¬ 
dent  variable  does  not  vary  with  measures  of  related  but  different  con¬ 
ceptual  variables. 

•  Conduct  manipulation  checks  of  the  independent  variable. 

there  is  a  negative  relationship  between  mortality  and  education  level  and 
family  income.  The  underlying  construct  being  tested  in  the  study  is  that 
these  two  factors — education  level  and  family  income — are  negatively  re¬ 
lated  to  mortality.  The  researchers  conduct  their  analyses  and  discover  that 
their  hypothesis  is  confirmed — that  is,  that  mortality  rates  are  negatively 
related  to  education  level  and  family  income.  The  researchers  conclude 
that  educational  level  and  family  income  are  protective  factors  that  reduce 
the  likelihood  of  mortality. 

Is  this  the  most  likely  explanation  for  the  results,  or  is  there  perhaps 
a  better  explanation  that  might  function  as  a  threat  to  the  study’s  hypo¬ 
thesis  regarding  causation  (or  construct  validity)?  What  might  be  a  better 
causal  explanation  for  the  results  of  the  study?  One  possible  alternative  ex¬ 
planation  of  the  results  might  be  that  higher  educational  levels  and  family 
income  reduce  mortality  rates  because  they  are  related  to  another  factor 
that  was  not  considered  in  the  study.  Considering  that  educational  level  is 
usually  positively  related  to  income  level,  higher  levels  of  education  tend 
to  lead  to  higher  levels  of  income.  A  higher  level  of  income  usually  pro¬ 
vides  access  to  a  wider  variety  of  privileges  and  services,  such  as  access  to 
higher-quality health care. Access to health care is therefore related to education
level and family income, and it is a plausible causal explanation for
the results obtained in the study (other than those espoused by the
researchers).
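
The alternative explanation described above can be probed statistically by asking whether the income-mortality relationship survives once health care access is held constant. The sketch below uses a partial correlation for this purpose. That choice of technique is an assumption rather than something the text prescribes, only income is modeled for brevity, and the country-level scores are invented and standardized purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after statistically controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented, standardized country-level scores (not real data).
health_access = [-1, -1, -1, -1, 1, 1, 1, 1]
income = [-0.5, -1.5, -0.5, -1.5, 1.5, 0.5, 1.5, 0.5]
mortality = [1.5, 1.5, 0.5, 0.5, -0.5, -0.5, -1.5, -1.5]

raw = pearson(income, mortality)
controlled = partial_correlation(income, mortality, health_access)
print(f"income-mortality correlation: {raw:.2f}")
print(f"after controlling for health care access: {controlled:.2f}")
```

In this constructed example the strong negative correlation between income and mortality essentially vanishes once health care access is controlled, which is the pattern one would expect if access, rather than income itself, carried the relationship. Real data would rarely be this clean, and a correlational analysis of this kind still cannot establish causation.
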

There  are  phenomena  that  oc¬ 
cur  within  the  context  of  research 
that  can  act  as  threats  to  construct 
validity.  As  with  internal  and  ex¬ 
ternal  validity,  the  number  and 
types  of  threats  are  related  to  the 
unique  aspects  and  design  of  the 
study  itself.  Generally,  these  threats  are  features  of  a  study  that  interfere 
with  the  researcher’s  ability  to  draw  causal  inferences  from  the  study’s  re¬ 
sults  (Kazdin,  2003c).  In  our  previous  discussions  of  internal  and  external 
validity,  we  were  able  to  identify  and  categorize  specific  and  well-defined 
threats.  The  threats  to  construct  validity  are  more  difficult  to  classify  be¬ 
cause  they  can  be  anything  that  relates  to  the  design  of  the  study  and  the 
underlying  theoretical  construct  under  consideration.  Despite  this,  the 
most  common  sources  of  threats  to  construct  validity  closely  parallel 
some  of  the  threats  to  external  validity  discussed  earlier  in  this  chapter 
such  as  conditions  surrounding  the  experimental  situation,  experimenter 
expectancies,  and  characteristics  of  the  participants. 


DON’T FORGET


Threats  to 
Construct  Validity 

Threats  to  construct  validity  relate 
to  the  unique  aspects  and  design 
of  the  study  that  interfere  with  the 
researcher’s  ability  to  draw  causal 
inferences  from  the  study’s  results. 


STATISTICAL  VALIDITY 

The  final  type  of  validity  that  we  will  discuss  in  this  chapter  is  the  critically 
important  yet  often-overlooked  concept  of  statistical  validity.  As  its  name 
implies,  statistical  validity  (also  referred  to  as  statistical  conclusion  validity)  refers 
to  aspects  of  quantitative  evaluation  that  affect  the  accuracy  of  the  con¬ 
clusions  drawn  from  the  results  of  a  study  (Campbell  &  Stanley,  1966; 
Cook  &  Campbell,  1979).  Statistical  procedures  are  typically  used  to  test 
the  relationship  between  two  or  more  variables  and  determine  whether  an 
observed  statistical  effect  is  due  to  chance  or  is  a  true  reflection  of  a  causal 
relationship  (Rosnow  &  Rosenthal,  2002).  At  its  simplest  level,  statistical 
validity addresses the question of whether the statistical conclusions
drawn from the results of a study are reasonable (Graziano & Raulin,
2004). 

The  concepts  of  hypothesis  testing  and  statistical  evaluation  are  inter¬ 
related,  and  they  provide  the  foundation  for  evaluating  statistical  validity. 
Statistical  evaluation  refers  to  the  theoretical  basis,  rationale,  and  computa¬ 
tional  aspects  of  the  actual  statistics  used  to  evaluate  the  nature  of  the  re¬ 
lationship  between  the  independent  and  dependent  variables.  Among 
other  things,  the  choice  of  statistical  techniques  often  depends  on  the  na¬ 
ture  of  the  hypotheses  being  tested  in  the  study.  This  is  where  the  concept 
of  hypothesis  testing  enters  our  discussion  of  statistical  validity.  Put  simply, 
every  study  is  driven  by  one  or  more  hypotheses  that  guide  the  method¬ 
ological  design  of  the  study,  the  statistical  analyses,  and  the  resulting  con¬ 
clusions. 

As  discussed  in  Chapter  2,  there  are  two  main  types  of  hypotheses  in  re¬ 
search: the null hypothesis (usually designated as H0) and the experimental
hypothesis (usually designated as H1, H2, H3, etc., depending on the
number  of  hypotheses).  The  experimental  hypothesis  represents  the  predicted 
relationship  among  the  variables  being  examined  in  the  study.  Conversely, 
the  null  hypothesis  represents  a  statement  of  no  relationship  among  the  vari¬ 
ables  being  examined  (Christensen,  1988). 

At  this  point,  we  should  review  an  important  convention  in  research 
methodology as it relates to statistical analyses and hypothesis testing.
Rejecting the null hypothesis is a necessary first step in evaluating the impact
of  the  independent  variable  (Graziano  &  Raulin,  2004).  Therefore,  in 
terms  of  statistical  analyses,  the  focus  is  always  on  the  null  hypothesis,  and 
not  on  the  experimental  hypotheses.  Researchers  reject  the  null  hypothe¬ 
sis  if  a  statistically  significant  difference  is  found  between  the  experimen¬ 
tal  and  control  conditions  (Kazdin,  2003c).  By  contrast,  researchers  retain 
(or  fail  to  reject)  the  null  hypothesis  if  no  statistically  significant  difference 
is  found  between  the  experimental  and  control  conditions. 

As  with  the  other  forms  of  validity  discussed  throughout  this  chapter, 
there  are  numerous  threats  to  statistical  validity.  The  most  common  in¬ 
clude  low  statistical  power,  variability  in  the  experimental  procedures  and 
participant characteristics, unreliability of measures, and multiple
comparisons and error rates. Each of these threats can have a significant
impact on the study’s ability to delineate causal relationships and rule out
plausible  rival  hypotheses. 

Low  Statistical  Power 

Low statistical power is the most common threat to statistical validity
(Keppel, 1991; Kirk, 1995). The presence of this threat produces a low
probability of detecting a difference between experimental and control
conditions even when a difference truly exists. Low statistical power is
directly related to small effect sizes and small sample sizes, with the
presence of each increasing the likelihood that low statistical power is an
issue in the research design. Accordingly, low statistical power can cause a
researcher to conclude that there is no significant effect even when a true
effect actually exists (Rosnow & Rosenthal, 2002). The concept of power will
be discussed further in Chapter 7.
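
To make the link between effect size, sample size, and power concrete, the
following is a minimal sketch that approximates power for a two-group
comparison. It is not taken from the text: the function name, the default
alpha of .05, the use of a standardized effect size (difference in means
divided by a common standard deviation), and the normal approximation are all
illustrative assumptions.

```python
import math
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate two-sided power for comparing two group means.

    Assumes (for illustration only) equal group sizes, a known common
    standard deviation, and a normally distributed test statistic.
    """
    z = NormalDist()  # standard normal distribution
    z_critical = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect truly exists.
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_critical) + z.cdf(-shift - z_critical)


if __name__ == "__main__":
    # Same hypothetical effect size, increasingly large samples.
    for n in (10, 30, 64, 200):
        print(n, round(approximate_power(0.5, n), 2))
```

Under these assumptions, the estimated power rises as either the effect size
or the number of participants per group increases, which is the relationship
described above.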


Variability 

Variability  is  another  threat  to  statistical  validity  that  applies  to  both  the 
participants  and  procedures  used  in  a  study.  First,  let’s  consider  variability 
in  methodological procedures.  This  concept  includes  a  wide  array  of  differences 
and  questions  that  relate  to  the  actual  design  aspects  of  the  study.  These 
differences  can  be  found  in  the  delivery  of  the  independent  variable,  the 
procedures  related  to  the  execution  of  the  study,  variability  in  perfor¬ 
mance  measures  over  time,  and  a  host  of  other  examples  that  are  directly 
dependent  on  the  unique  design  of  a  particular  study.  A  related  threat  to 
statistical  validity  is  variability  in  participant  characteristics.  Participants  in  a  re¬ 
search  study  can  vary  along  a  variety  of  characteristics  and  dimensions, 
such  as  age,  education,  socioeconomic  status,  and  race.  As  the  diversity  of 
participant  characteristics  increases,  there  is  less  likelihood  that  a  differ¬ 
ence  between  the  control  and  experimental  conditions  can  be  detected. 
When variability across these two broad sources is minimized, the
likelihood of detecting a true difference between the control and experimental
conditions  increases.  This  threat  to  statistical  validity  must  be  considered 
at  the  planning  stage  of  the  study,  and  it  is  usually  controlled  through  the 
use  of  homogeneous  samples,  strict  and  well-defined  procedural  proto¬ 
cols,  and  statistical  controls  at  the  data  analysis  stage. 

Unreliability  of  Measures 

Unreliability  of  measures  used  in  a  study  is  another  source  of  variability 
that  is  a  threat  to  statistical  validity.  This  threat  refers  to  whether  the  mea¬ 
sures  used  in  the  study  assess  the  characteristics  of  interest  in  a  consis¬ 
tent — or  reliable — fashion  (Kazdin,  2003c).  If  the  research  study’s  mea¬ 
sures  are  unreliable,  then  more  random  variability  is  introduced  into  the 
experimental  design.  As  with  participant  and  procedural  variability,  this 
type  of  variability  decreases  statistical  power  and  makes  it  less  likely  that 
the  statistical  analyses  will  detect  a  true  difference  between  the  control  and 
experimental  conditions  when  a  difference  actually  exists. 
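
One way to see why added random variability makes a true difference harder to
detect is to look at the standardized difference between conditions. The
sketch below is an illustration only: it assumes that measurement error is
independent of the true scores (so the variances add), and the function name
and numbers are hypothetical.

```python
import math


def observed_effect_size(mean_difference, true_sd, error_sd=0.0):
    """Standardized mean difference after adding random measurement error.

    Assumes the error is independent of the true scores, so the observed
    variance is the sum of the true variance and the error variance.
    """
    observed_sd = math.sqrt(true_sd ** 2 + error_sd ** 2)
    return mean_difference / observed_sd


if __name__ == "__main__":
    # The raw difference between conditions is held constant at 5 points.
    print(observed_effect_size(5, true_sd=10))               # reliable measure
    print(observed_effect_size(5, true_sd=10, error_sd=10))  # noisy measure
```

The raw difference between conditions is unchanged in both calls, but the
noisier measure yields a smaller standardized difference, and a smaller
standardized difference is harder to detect at any given sample size.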

Multiple  Comparisons 

The  final  threat  to  statistical  validity  that  we  will  consider  is  often  referred 
to  as  multiple  statistical  comparisons  and  the  resulting  error  rates  (Kazdin, 
2003c;  Rosnow  &  Rosenthal,  2002).  This  threat  to  statistical  validity  per¬ 
tains  to  the  number  of  statistical  analyses  used  to  analyze  the  data  obtained 
in  a  study.  Generally,  as  the  number  of  statistical  analyses  increases,  so  does 
the  likelihood  of  finding  a  significant  difference  between  the  experimental 
and  control  conditions  purely  by  mathematical  chance.  In  other  words,  the 
significant  finding  is  a  mathematical  artifact  and  does  not  reflect  a  true  dif¬ 
ference  between  conditions.  Accordingly,  researchers  should  define  their 
hypotheses  before  the  study  begins  so  as  to  conduct  the  minimum  number 
of  statistical  analyses  to  address  each  of  the  hypotheses. 
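
The growth in chance findings can be illustrated with a short calculation.
The sketch below assumes that the analyses are statistically independent and
that each uses the same significance level; both are simplifying assumptions.
It also shows the Bonferroni adjustment, which is one common correction and is
not a procedure named in the text.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, round(familywise_error_rate(k), 3), bonferroni_alpha(k))
```

Limiting the number of planned analyses keeps the first quantity small without
requiring a stricter per-test threshold.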

Rapid Reference 6.8 summarizes the threats to statistical validity that
we have discussed in this section.


Rapid Reference 6.8


Threats  to  Statistical  Validity 

•  Low  statistical  power:  Low  probability  of  detecting  a  difference  be¬ 
tween  experimental  and  control  conditions  even  if  a  difference  truly 
exists. 

•  Procedural  and  participant  variability:  Variability  in  methodolog¬ 
ical  procedures  and  a  host  of  participant  characteristics,  which  de¬ 
creases  the  likelihood  of  detecting  a  difference  between  the  control 
and  experimental  conditions. 

•  Unreliability of measures: Whether the measures used in a study
assess the characteristics of interest in a consistent manner. Unreliable
measures introduce more random variability into the research design,
which reduces statistical power.

•  Multiple comparisons and error rates: The concept that, as the
number  of  statistical  analyses  increases,  so  does  the  likelihood  of  finding 
a  significant  difference  between  the  experimental  and  control  condi¬ 
tions  purely  by  chance. 


SUMMARY 

In  this  chapter,  we  have  discussed  the  four  types  of  validity  that  are  criti¬ 
cal  to  sound  research  methodology.  In  addition,  we  discussed  the  major 
threats  to  each  type  of  validity.  Although  each  type  of  validity  and  its  re¬ 
lated  threats  were  presented  independently,  it  is  important  to  note  that  all 
types  of  validity  are  interdependent,  and  addressing  one  type  may  com¬ 
promise  the  other  types.  As  was  discussed,  all  of  the  broad  threats  to  va¬ 
lidity  should  be  considered  at  the  design  stage  of  the  study  if  possible.  In 
terms  of  priority,  ensuring  strong  internal  validity  is  regarded  as  more  im¬ 
portant  than  external  validity,  because  we  must  control  for  rival  hypothe¬ 
ses  before  we  can  even  begin  to  think  about  generalizing  the  results  of  a 
study.


TEST YOURSELF

1. _____ is an important concept in research that refers to the
conceptual and scientific soundness of a research study.

2. History, maturation, testing, statistical regression, and selection
biases are threats to _____.

3. External validity is concerned with the _____ of research results.

4. _____ refers to aspects of quantitative evaluation that affect the
accuracy of the conclusions drawn from the results of a study.

5. _____ refers to the congruence between the study’s results and the
theoretical underpinnings guiding the research.

Answers: 1. Validity; 2. internal validity; 3. generalizability;
4. Statistical conclusion validity; 5. Construct validity


Seven


DATA  PREPARATION,  ANALYSES, 
AND  INTERPRETATION 


As  we  have  discussed  in  previous  chapters,  in  most  research  stud¬ 
ies,  the  researcher  begins  by  generating  a  research  question,  fram¬ 
ing  it  into  a  testable  (i.e.,  falsifiable)  hypothesis,  selecting  an  ap¬ 
propriate  research  design,  choosing  a  suitable  sample  of  research 
participants,  and  selecting  valid  and  reliable  methods  of  measurement.  If 
all of these tasks have been carried out properly, then data analysis
should be fairly straightforward. Still, a variety of important steps
must be taken to ensure the integrity and validity of research
findings  and  their  interpretation. 

In  most  types  of  research  studies,  the  process  of  data  analysis  involves 
the  following  three  steps:  (1)  preparing  the  data  for  analysis,  (2)  analyzing 
the  data,  and  (3)  interpreting  the  data  (i.e.,  testing  the  research  hypotheses 
and  drawing  valid  inferences).  Therefore,  we  will  begin  this  chapter  with  a 
brief  discussion  of  data  cleaning  and  organization,  followed  by  a  nontech¬ 
nical  overview  of  the  most  widely  used  descriptive  and  inferential  statis¬ 
tics.  We  will  conclude  this  chapter  with  a  discussion  of  several  important 
concepts  that  should  be  understood  when  interpreting  and  drawing  infer¬ 
ences  from  research  findings.  Because  a  comprehensive  discussion  of  sta¬ 
tistical  techniques  is  well  beyond  the  scope  of  this  book,  researchers  seek¬ 
ing  a  more  detailed  review  of  statistical  analyses  should  consult  one  of  the 
statistical  textbooks  contained  in  the  reference  list. 


DATA PREPARATION

Virtually  all  studies,  from  surveys  to  randomized  experimental  trials,  re¬ 
quire  some  form  of  data  collection  and  entry.  Data  represent  the  fruit  of 
researchers’  labor  because  they  provide  the  information  that  will  ulti¬ 
mately  allow  them  to  describe  phenomena,  predict  events,  identify  and 
quantify  differences  between  conditions,  and  establish  the  effectiveness  of 
interventions.  Because  of  their  critical  nature,  data  should  be  treated  with 
the  utmost  respect  and  care.  In  addition  to  ensuring  the  confidentiality 
and  security  of  personal  data  (as  discussed  in  Chapter  8),  the  researcher 
should  carefully  plan  how  the  data  will  be  logged,  entered,  transformed  (as 
necessary),  and  organized  into  a  database  that  will  facilitate  accurate  and 
efficient  statistical  analysis. 

Logging  and  Tracking  Data 

Any  study  that  involves  data  collection  will  require  some  procedure  to  log 
the  information  as  it  comes  in  and  track  it  until  it  is  ready  to  be  analyzed. 
Research  data  can  come  from  any  number  of  sources  (e.g.,  personal 
records,  participant  interviews,  observations,  laboratory  reports,  and 
pretest and posttest measures). Without a well-established procedure, data
can  easily  become  disorganized,  uninterpretable,  and  ultimately  unusable. 

Although  there  is  no  one  definitive  method  for  logging  and  tracking 
data  collection  and  entry,  in  this  age  of  computers  it  might  be  considered 
inefficient  and  impractical  not  to  take  advantage  of  one  of  the  many  avail¬ 
able  computer  applications  to  facilitate  the  process.  Taking  the  time  to  set 
up  a  recruitment  and  tracking  system  on  a  computer  database  (e.g.,  Mi¬ 
crosoft  Access,  Microsoft  Excel,  Claris  FileMaker,  SPSS,  SAS)  will  provide 
researchers  with  up-to-date  information  throughout  the  study,  and  it  will 
save  substantial  time  and  effort  when  they  are  ready  to  analyze  their  data 
and  report  the  findings. 

One  of  the  key  elements  of  the  data  tracking  system  is  the  recruitment 
log. The recruitment log is a comprehensive record of all individuals
approached about participation in a study. The log can also serve to record
the dates and times that potential participants were approached, whether
they  met  eligibility  criteria,  and  whether  they  agreed  and  provided  in¬ 
formed  consent  to  participate  in  the  study.  Importantly,  for  ethical  rea¬ 
sons,  no  identifying  information  should  be  recorded  for  individuals  who 
do  not  consent  to  participate  in  the  research  study.  The  primary  purpose 
of  the  recruitment  log  is  to  keep  track  of  participant  enrollment  and  to  de¬ 
termine  how  representative  the  resulting  cohort  of  study  participants  is  of 
the  population  that  the  researcher  is  attempting  to  examine. 
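
A recruitment log of the kind described above could be sketched as follows.
The class, field, and function names are hypothetical, and the layout is only
one of many possibilities. The one rule it enforces comes from the text: no
identifier is stored for anyone who has not consented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RecruitmentEntry:
    approached_at: datetime
    eligible: bool
    consented: bool
    # A study ID is assigned only after consent; nothing identifying is
    # stored for individuals who decline to participate.
    participant_id: Optional[str] = None

    def __post_init__(self):
        if not self.consented and self.participant_id is not None:
            raise ValueError("Do not store identifiers without consent.")


def enrollment_summary(log):
    """Count how many individuals were approached, eligible, and enrolled."""
    return {
        "approached": len(log),
        "eligible": sum(entry.eligible for entry in log),
        "enrolled": sum(entry.eligible and entry.consented for entry in log),
    }


if __name__ == "__main__":
    log = [
        RecruitmentEntry(datetime(2024, 1, 5, 9, 0), True, True, "P001"),
        RecruitmentEntry(datetime(2024, 1, 5, 10, 0), True, False),
        RecruitmentEntry(datetime(2024, 1, 6, 9, 0), False, False),
    ]
    print(enrollment_summary(log))
```

Counts like these give the researcher a running picture of enrollment and a
starting point for judging how representative the enrolled cohort is.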

In  some  study  settings,  where  records  are  maintained  on  all  potential 
participants  (e.g.,  treatment  programs,  schools,  organizations),  it  may  be 
possible  for  the  researcher  to  obtain  aggregate  information  on  eligible  in¬ 
dividuals  who  were  not  recruited  into  the  study,  either  because  they  chose 
not  to  participate  or  because  they  were  not  approached  by  the  researcher. 
Importantly,  because  these  individuals  did  not  provide  informed  consent, 
these  data  can  only  be  obtained  in  aggregate,  and  they  must  be  void  of  any 
identifying  information.  Given  this  type  of  aggregate  information,  the  re¬ 
searcher  would  be  able  to  determine  whether  the  study  sample  is  repre¬ 
sentative  of  the  population. 

In addition to logging client recruitment, a well-designed tracking system
can provide the researcher with up-to-date information on the general status
of the study, including client participation, data collection, and data entry.


DON’T FORGET

Record-Keeping Responsibilities

The lead researcher (referred to as principal investigator in grant-funded
research) is ultimately responsible for maintaining the validity and quality
of all research data, including the proper training of all research staff and
developing and enforcing policies for recording, maintaining, and storing
data. The researcher should ensure that

•  research data are collected and recorded according to policy;

•  research data are stored in a way that will ensure security and
confidentiality; and

•  research data are audited on a regular basis to maintain quality control
and identify potential problems as they occur.

Data  Screening 

Immediately  following  data  collection,  but  prior  to  data  entry,  the  re¬ 
searcher  should  carefully  screen  all  data  for  accuracy.  The  promptness  of 
these  procedures  is  very  important  because  research  staff  may  still  be  able 
to  recontact  study  participants  to  address  any  omissions,  errors,  or  inac¬ 
curacies.  In  some  cases,  the  research  staff  may  inadvertently  have  failed  to 
record  certain  information  (e.g.,  assessment  date,  study  site)  or  perhaps 
recorded  a  response  illegibly.  In  such  instances,  the  research  staff  may  be 
able to correct the data themselves, if not too much time has elapsed.
Because data collection and data entry are often done by different research
staff,  it  may  be  more  difficult  and  time  consuming  to  make  such  clarifica¬ 
tions  once  the  information  is  passed  on  to  data  entry  staff. 

One  way  to  simplify  the  data  screening  process  and  make  it  more  time 
efficient  is  to  collect  data  using  computerized  assessment  instruments. 
Computerized  assessments  can  be  programmed  to  accept  only  responses 
within  certain  ranges,  to  check  for  blank  fields  or  skipped  items,  and  even 
to  conduct  cross-checks  between  certain  items  to  identify  potential  in¬ 
consistencies  between  responses.  Another  major  benefit  of  these  pro¬ 
grams  is  that  the  entered  data  can  usually  be  electronically  transferred  into 
a  permanent  database,  thereby  automating  the  data  entry  procedure.  Al¬ 
though  this  type  of  computerization  may,  at  first  glance,  appear  to  be  an 
impossible  budgetary  expense,  it  might  be  more  economical  than  it  seems 
when  one  considers  the  savings  in  staff  time  spent  on  data  screening  and 
entry. 

Whether  it  is  done  manually  or  electronically,  data  screening  is  an  es¬ 
sential  process  in  ensuring  that  data  are  accurate  and  complete.  Generally, 
the researcher should plan to screen the data to make certain that
(1) responses are legible and understandable, (2) responses are within an
acceptable range, (3) responses are complete, and (4) all of the necessary
information has been included.
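
The range and completeness checks listed above lend themselves to automation.
The following is a minimal sketch under stated assumptions: each participant's
responses are held in a dictionary, and the field names and acceptable ranges
are hypothetical. Legibility, the first check, still requires a human reader
when data are collected on paper.

```python
def screen_record(record, required_fields, valid_ranges):
    """Return a list of screening problems for one participant's record."""
    problems = []
    # Completeness: every required item is present and not blank.
    for field in required_fields:
        if record.get(field) in (None, ""):
            problems.append(f"{field}: missing")
    # Range: numeric responses fall within the acceptable limits.
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value in (None, ""):
            continue  # blanks are reported above when the field is required
        if not isinstance(value, (int, float)):
            problems.append(f"{field}: not numeric")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


if __name__ == "__main__":
    record = {"age": 212, "site": "", "item1": 3}
    print(screen_record(
        record,
        required_fields=["age", "site", "item1"],
        valid_ranges={"age": (18, 99), "item1": (1, 5)},
    ))
```

Running a check like this immediately after collection leaves time to
recontact participants while corrections are still possible.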


Constructing  a  Database 

Once  data  are  screened  and  all  corrections  are  made,  the  data  should  be 
entered  into  a  well-structured  database.  When  planning  a  study,  the  re¬ 
searcher  should  carefully  consider  the  structure  of  the  database  and  how 
it  will  be  used.  In  many  cases,  it  may  be  helpful  to  think  backward  and  to 
begin  by  anticipating  how  the  data  will  be  analyzed.  This  will  help  the  re¬ 
searcher  to  figure  out  exactly  which  variables  need  to  be  entered,  how  they 
should  be  ordered,  and  how  they  should  be  formatted.  Moreover,  the  sta¬ 
tistical  analysis  may  also  dictate  what  type  of  program  you  choose  for  your 
database.  For  example,  certain  advanced  statistical  analyses  may  require 
the  use  of  specific  statistical  programs. 

While  designing  the  general  structure  of  the  database,  the  researcher 
must  carefully  consider  all  of  the  variables  that  will  need  to  be  entered. 
Forgetting  to  enter  one  or  more  variables,  although  not  as  problematic  as 
failing to collect certain data elements, will add substantial effort and
expense because the researcher must then go back to the hard data to find
the missing data elements.


DON’T FORGET

Retaining Data Records

Researchers should retain study data for a minimum period of 5 years after
publication of their data in the event that questions or concerns arise
regarding the findings. The advancement of science relies on the scientific
community’s overall confidence in disseminated findings, and the existence
of the primary data serves to instill such confidence.


The Data Codebook

In addition to developing a well-structured database, researchers should
take the time to develop a data codebook. A data codebook is a written or
computerized list that provides a clear and comprehensive description of
the variables that will be included in the database. A detailed codebook is
essential when the researcher begins to analyze the data. Moreover, it serves
as  a  permanent  database  guide,  so  that  the  researcher,  when  attempting  to 
reanalyze  certain  data,  will  not  be  stuck  trying  to  remember  what  certain 
variable  names  mean  or  what  data  were  used  for  a  certain  analysis.  Ulti¬ 
mately,  the  lack  of  a  well-defined  data  codebook  may  render  a  database 
uninterpretable  and  useless.  At  a  bare  minimum,  a  data  codebook  should 
contain  the  following  elements  for  each  variable: 

•  Variable  name 

•  Variable  description 

•  Variable format (number, date, text)

•  Instrument  or  method  of  collection 

•  Date  collected 

•  Respondent  or  group 

•  Variable  location  (in  database) 

•  Notes 
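
A computerized codebook with the elements listed above might be structured as
in the following sketch. The class name, field names, and the sample entry are
hypothetical; a spreadsheet or the variable-definition features of a
statistical package could serve the same purpose.

```python
from dataclasses import dataclass


@dataclass
class CodebookEntry:
    name: str
    description: str
    data_format: str      # e.g., "number", "date", or "text"
    instrument: str       # instrument or method of collection
    date_collected: str
    respondent: str       # respondent or group
    location: str         # where the variable sits in the database
    notes: str = ""


def add_entry(codebook, entry):
    """Add an entry to the codebook, refusing duplicate variable names."""
    if entry.name in codebook:
        raise ValueError(f"Variable {entry.name!r} is already documented.")
    codebook[entry.name] = entry
    return codebook


if __name__ == "__main__":
    codebook = {}
    add_entry(codebook, CodebookEntry(
        name="q1",
        description="Sessions attended in week 1",
        data_format="number",
        instrument="Attendance record (hypothetical)",
        date_collected="Week 1 of treatment",
        respondent="Client",
        location="Column 5 of the attendance table",
    ))
    print(codebook["q1"].description)
```

Rejecting duplicate names is a small safeguard against two variables being
documented under the same label.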


DON’T FORGET

Defining Variables Within a Database

Certain databases, particularly statistical programs (e.g., SPSS), allow the
researcher to enter a wide range of descriptive information about each
variable, including the variable name, the type of data (e.g., numeric, text,
currency, date), label (how it will be referred to in data printouts), how
missing data are coded or treated, and measurement scale (e.g., nominal,
ordinal, interval, or ratio). Although these databases are extremely helpful
and should be used whenever possible, they do not substitute for a
comprehensive codebook, which includes separate information about the
different databases themselves (e.g., which databases were used for each set
of analyses).


Data Entry

After the data have been screened for completeness and accuracy, and the
researcher has developed a well-structured database and a detailed codebook,
data entry should be fairly straightforward. Nevertheless, many errors can
occur at this stage. Therefore, it is critical that all data-entry staff
are properly trained and maintain the highest level of accuracy when
inputting data. One way of ensuring the accuracy of data entry is through
double entry. In the double-entry procedure, data are entered into the
database twice and then compared to determine whether there are any
discrepancies. The researcher or data entry staff can then examine the
discrepancies and determine whether they can be resolved and corrected or
if they should simply be treated as missing data. Although the double-entry
process is a very effective way to identify entry errors, it may be
difficult to manage and may not be time or cost effective.
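
The comparison step of the double-entry procedure can be sketched as follows.
This is an illustration under assumed conventions: each entry pass is stored
as a dictionary keyed by a (participant ID, variable name) pair, and the
function name is hypothetical.

```python
def find_discrepancies(first_pass, second_pass):
    """List every cell whose value differs between two data-entry passes.

    Each pass maps (participant_id, variable_name) to the entered value.
    A cell present in only one pass is reported with None for the other.
    """
    discrepancies = []
    for key in sorted(set(first_pass) | set(second_pass)):
        first_value = first_pass.get(key)
        second_value = second_pass.get(key)
        if first_value != second_value:
            discrepancies.append((key, first_value, second_value))
    return discrepancies


if __name__ == "__main__":
    first = {("P001", "q1"): 3, ("P001", "q2"): 4}
    second = {("P001", "q1"): 3, ("P001", "q2"): 1, ("P002", "q1"): 2}
    for cell, a, b in find_discrepancies(first, second):
        print(cell, a, b)
```

Each reported cell would then be checked against the original record and
either corrected or treated as missing.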

As  an  alternative  to  double  entry,  the  researcher  may  design  a  standard 
procedure  for  checking  the  data  for  inaccuracies.  Such  procedures  typi¬ 
cally  entail  a  careful  review  of  the  inputted  data  for  out-of-range  values, 
missing  data,  and  incorrect  formatting.  Much  of  this  work  can  be  accom¬ 
plished  by  running  descriptive  analyses  and  frequencies  on  each  variable. 
In  addition,  many  database  programs  (e.g.,  Microsoft  Excel,  Microsoft 
Access,  SPSS)  allow  the  researcher  to  define  the  ranges,  formats,  and  types 
of  data  that  will  be  accepted  into  certain  data  fields.  These  databases  will 
make  it  impossible  to  enter  information  that  does  not  meet  the  preset  cri¬ 
teria.  Defining  data  entry  criteria  in  this  manner  can  prevent  many  errors 
and  it  may  substantially  reduce  the  time  spent  on  data  cleaning. 


Transforming  Data 

After  the  data  have  been  entered  and  checked  for  inaccuracies,  the  re¬ 
searcher  or  data  entry  staff  will  undoubtedly  be  required  to  make  certain 
transformations  before  the  data  can  be  analyzed.  These  transformations 
typically  involve  the  following: 

•  Identifying  and  coding  missing  values 

•  Computing  totals  and  new  variables 

•  Reversing  scale  items 

•  Recoding and categorization


Identifying and Coding Missing Values

Inevitably,  all  databases  and  most  variables  will  have  some  number  of 
missing  values.  This  is  a  result  of  either  study  participants’  failing  to  re¬ 
spond  to  certain  questions,  missed  observations,  or  inaccurate  data  that 
were  rejected  from  the  database.  Researchers  and  data  analysts  often  do 
not  want  to  include  certain  cases  with  missing  data  because  they  may  po¬ 
tentially  skew  the  results.  Therefore,  most  statistical  packages  (e.g.,  SPSS, 
SAS)  will  provide  the  option  of  ignoring  cases  in  which  certain  variables 
are  considered  missing,  or  they  will  automatically  treat  blank  values  as 
missing.  These  programs  also  typically  allow  the  researcher  to  designate 
specific values to represent missing data (e.g., -99). A small sample of the
many techniques used for imputing missing data values is discussed in
Rapid Reference 7.1.
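
Outside a dedicated statistical package, the same two ideas (a designated
missing-value code and the option of ignoring incomplete cases) can be
sketched in a few lines. The function names are hypothetical, and -99 is used
only because it is the example code given above.

```python
MISSING_CODE = -99  # designated missing-value code from the example above


def recode_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-value code and blank entries with None."""
    return [None if v in (missing_code, None, "") else v for v in values]


def complete_cases(rows):
    """Keep only the rows (dictionaries) that have no missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


if __name__ == "__main__":
    print(recode_missing([4, -99, 2, "", 5]))
    rows = [{"q1": 3, "q2": 4}, {"q1": None, "q2": 2}]
    print(complete_cases(rows))
```

Recoding to an explicit missing marker prevents a code such as -99 from being
averaged into a total by mistake.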


Rapid Reference 7.1


Missing  Value  Imputation 

Virtually  all  databases  have  some  number  of  missing  values.  Unfortunately, 
statistical  analysis  of  data  sets  with  missing  values  can  result  in  biased  re¬ 
sults  and  incorrect  inferences.  Although  numerous  techniques  have  been 
offered  to  impute  missing  values,  there  is  an  ongoing  debate  in  contem¬ 
porary  statistics  as  to  which  technique  is  the  most  appropriate.  A  few  of 
the  more  widely  used  imputation  techniques  include  the  following: 

Hot  deck  imputation:  In  this  imputation  technique,  the  researcher 
matches  participants  on  certain  variables  to  identify  potential  donors. 
Missing  values  are  then  replaced  with  values  taken  from  matching  respon¬ 
dents  (i.e.,  respondents  who  are  matched  on  a  set  of  relevant  factors). 

Predicted  mean  imputation:  Imputed  values  are  predicted  using  cer¬ 
tain  statistical  procedures  (i.e.,  linear  regression  for  continuous  data  and 
discriminant  function  for  dichotomous  or  categorical  data). 

Last  value  carried  forward:  Imputed  values  are  based  on  previously 
observed  values. This  method  can  be  used  only  for  longitudinal  variables, 
for  which  participants  have  values  from  previous  data  collection  points. 

Group  means:  Imputed  variables  are  determined  by  calculating  the  vari¬ 
able’s  group  mean  (or  mode,  in  the  case  of  categorical  data). 
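
Two of the simpler techniques above, last value carried forward and group
means, can be sketched as follows. This is an illustration only: missing
values are assumed to be marked with None, and the function names and data
are hypothetical. It is not a recommendation of either technique.

```python
from statistics import mean


def last_value_carried_forward(series):
    """Fill each missing value with the most recent observed value.

    Values that are missing before any observation remain missing.
    """
    filled, last_observed = [], None
    for value in series:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


def group_mean_imputation(values, groups):
    """Fill each missing value with the mean of its group's observed values."""
    observed = {}
    for value, group in zip(values, groups):
        if value is not None:
            observed.setdefault(group, []).append(value)
    group_means = {group: mean(vals) for group, vals in observed.items()}
    return [
        group_means.get(group) if value is None else value
        for value, group in zip(values, groups)
    ]


if __name__ == "__main__":
    print(last_value_carried_forward([None, 3, None, None, 5, None]))
    print(group_mean_imputation([2, None, 4, 10, None], ["a", "a", "a", "b", "b"]))
```

A group with no observed values at all cannot be imputed this way, so its
missing values are left as None.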
Computing Totals and New Variables

In certain instances, the researcher may want to create new variables based
on values from other variables. For example, suppose a researcher has data
on the number of times clients in two different treatments attended their
treatments each week. The researcher would have a total of four variables,
each representing the number of sessions attended each week during the first
month of treatment. Let’s call them q1, q2, q3, and q4. If the researcher
wanted to analyze monthly attendance by the different treatments, he or she
would have to compute a new variable. This could be done with the following
transformation:

total = q1 + q2 + q3 + q4
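
In code, the same transformation over the four weekly attendance variables
might look like the sketch below. The function name and the decision to treat
the total as missing when any week is missing are assumptions made for
illustration; a researcher might reasonably choose a different rule.

```python
def monthly_total(record, weekly_items=("q1", "q2", "q3", "q4")):
    """Sum the four weekly attendance counts for one client.

    Returns None when any weekly value is missing, so that an incomplete
    month is not mistaken for low attendance.
    """
    weekly_values = [record.get(item) for item in weekly_items]
    if any(value is None for value in weekly_values):
        return None
    return sum(weekly_values)


if __name__ == "__main__":
    clients = [
        {"id": "P001", "q1": 2, "q2": 3, "q3": 1, "q4": 2},
        {"id": "P002", "q1": 1, "q2": None, "q3": 2, "q4": 2},
    ]
    for client in clients:
        print(client["id"], monthly_total(client))
```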

Still  another  reason  for  transforming  variables  is  that  the  variable  may 
not  be  normally  distributed  (see  Rapid  Reference  7.2).  This  can  substan¬ 
tially  alter  the  results  of  the  data  analysis.  In  such  instances,  certain  data 
transformations  (see  Rapid  Reference  7.3)  may  serve  to  normalize  the  dis¬ 
tribution  and  improve  the  accuracy  of  outcomes. 

Reversing  Scale  Items 

Many instruments and measures use items with reversed scales to decrease
the likelihood of participants’ falling into what is referred to as a “response
set.” A response set occurs when a participant begins to respond in a patterned
manner to questions or statements on a test or assessment measure, regardless
of the content of each query or statement. For example, an individual may
answer false to all test items, or may provide a 1 for all items requesting
a response from 1 to 5. Here’s an example of how reverse scale items work:
Let’s say that participants in a survey are asked to indicate their levels
of agreement,

Rapid Reference 7.2


Normal  Distributions 

A  normal  distribution  is  a  distribu¬ 
tion  of  the  values  of  a  variable 
that,  when  plotted,  produces  a 
symmetrical,  bell-shaped  curve 
that  rises  smoothly  from  a  small 
number  of  cases  at  each  extreme 
to  a  large  number  of  cases  in  the 
middle. 

Rapid Reference 7.3


Data  Transformations 

Most statistical procedures assume that the variables being analyzed are
normally distributed. Analyzing variables that are not normally distributed
can lead to serious overestimation (Type I error) or underestimation
(Type II error). Therefore, before analyzing their data, researchers should
carefully examine variable distributions. Although this is often done by
simply looking over the frequency distributions, there are many more
objective methods of determining whether variables are normally distributed.
Typically, these involve examining each variable’s skewness, which
measures the overall lack of symmetry of the distribution (that is, whether it
looks the same to the left and right of the center point), and its kurtosis,
which measures whether the data are peaked or flat relative to a normal
distribution. Unfortunately, many variables in the social sciences and within
particular sample populations are not normally distributed. Therefore,
researchers often rely on one of several transformations to potentially
improve the normality of certain variables. The most frequently used
transformations are the square root transformation, the log transformation,
and the inverse transformation.

Square root transformation: Described simply, this type of transformation
involves taking the square root of each value within a certain variable.
The one caveat is that you cannot take a square root of a negative
number. Fortunately, this can be easily remedied by adding a constant to
each value before computing the square root, chosen so that the smallest
value is no longer negative (such as 1 when the lowest value is -1).

Log transformation: There is a wide variety of log transformations. In
general, however, a logarithm is the power (also known as the exponent)
to which a base number has to be raised to get the original number. As
with square root transformation, if a variable contains values less than 1, a
constant must be added to move the minimum value of the distribution.

Inverse transformation: This type of transformation involves taking
the inverse of each value by dividing it into 1. For example, the inverse of
3 would be computed as 1/3. Essentially, this procedure makes very small
values very large, and very large values very small, and it has the effect of
reversing the order of a variable’s scores. Therefore, researchers using this
transformation procedure should be careful not to misinterpret the
scores following their analysis.
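
The three transformations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names, the choice of base 10 for the logarithm, the simple moment-based skewness formula, and the sample data are ours, not part of any particular statistical package.

```python
import math

def shift_to_minimum(values, target_min):
    """Add a constant so that the smallest value equals target_min."""
    constant = target_min - min(values)
    return [v + constant for v in values]

def sqrt_transform(values):
    # Shift only if needed so that no value is negative.
    shifted = shift_to_minimum(values, 0) if min(values) < 0 else values
    return [math.sqrt(v) for v in shifted]

def log_transform(values):
    # Shift only if needed so that the minimum is 1, where log10(1) = 0.
    shifted = shift_to_minimum(values, 1) if min(values) < 1 else values
    return [math.log10(v) for v in shifted]

def inverse_transform(values):
    # Undefined for zero; for positive scores the order is reversed.
    return [1 / v for v in values]

def skewness(values):
    """Moment-based skewness: 0 for a perfectly symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical positively skewed variable.
raw = [1, 1, 2, 2, 3, 4, 9, 16, 25]
skew_before = skewness(raw)
skew_after = skewness(sqrt_transform(raw))
```

Comparing skew_before with skew_after gives a quick check of whether a transformation actually moved the distribution closer to symmetry.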
from 1 to 5, with a series of statements. In this survey, 1 corresponds with
completely  disagree  and  5  corresponds  with  completely  agree.  The  researcher 
may  decide,  however,  to  reverse-scale  some  of  the  items  on  the  survey,  so 
that  1  corresponds  with  completely  agree  and  5  corresponds  with  completely 
disagree.  This  may  reduce  the  likelihood  that  participants  will  fall  into  a  re¬ 
sponse  set.  Before  data  can  be  analyzed,  it  is  important  that  all  reversed 
items  are  recoded  so  that  all  of  the  responses  fall  in  the  same  direction. 
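
Recoding a reversed item amounts to subtracting each response from the sum of the lowest and highest scale points. The sketch below assumes a 1-to-5 scale; the item names and the set of reverse-scaled items are hypothetical.

```python
def reverse_score(response, low=1, high=5):
    """Map a response onto the opposite end of the scale (1<->5, 2<->4, 3 stays 3)."""
    return (low + high) - response

# Hypothetical survey responses from one participant.
reversed_items = {"item2", "item4"}
answers = {"item1": 5, "item2": 1, "item3": 4, "item4": 2}

# After recoding, a higher number means stronger agreement on every item.
recoded = {
    item: reverse_score(value) if item in reversed_items else value
    for item, value in answers.items()
}
```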

Recoding  Variables 

Some  variables  may  be  more  easily  analyzed  if  they  are  recoded  into  cate¬ 
gories.  For  example,  a  researcher  may  wish  to  collapse  income  estimates 
or  ages  into  specific  ranges.  This  is  an  example  of  turning  a  continuous 
variable  into  a  categorical  variable  (as  was  discussed  in  Chapter  2).  Al¬ 
though  categorizing  continuous  variables  may  ultimately  reduce  their 
specificity,  in  some  cases  it  may  be  warranted  to  simplify  data  analysis  and 
interpretation.  In  other  instances,  it  may  be  necessary  to  recategorize  or 
recode  categorical  variables  by  combining  them  into  fewer  categories. 
This  is  often  the  case  when  variables  have  so  many  categories  that  certain 
categories  are  sparsely  populated,  which  may  violate  the  assumptions  of 
certain  statistical  analyses.  To  resolve  this  issue,  researchers  may  choose  to 
combine  or  collapse  certain  categories. 
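
Both kinds of recoding can be sketched briefly in Python. The age ranges, the category labels, and the minimum count used below are hypothetical choices made only for illustration.

```python
from collections import Counter

def recode_age(age):
    """Collapse a continuous age into one of four hypothetical ranges."""
    if age < 25:
        return "18-24"
    if age < 35:
        return "25-34"
    if age < 45:
        return "35-44"
    return "45+"

def collapse_sparse(categories, min_count):
    """Combine categories with fewer than min_count cases into 'other'."""
    counts = Counter(categories)
    return [c if counts[c] >= min_count else "other" for c in categories]

# Continuous variable recoded into categories.
ages = [34, 27, 23, 23, 26, 27, 28, 23, 32, 41]
age_groups = [recode_age(a) for a in ages]

# Categorical variable with sparsely populated categories.
marital = ["single", "single", "married", "married", "married",
           "widowed", "separated"]
collapsed = collapse_sparse(marital, min_count=2)
```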

Once  the  data  have  been  screened,  entered,  cleaned,  and  transformed, 
they  should  be  ready  to  be  analyzed.  It  is  possible,  of  course,  that  the  data 
will  need  to  be  recoded  or  transformed  again  during  the  analyses.  In  fact, 
the  need  for  many  of  the  transformations  discussed  previously  will  not  be 
identified  until  the  analyses  have  begun.  Still,  taking  the  time  to  carefully 
prepare  the  data  first  should  make  data  analysis  more  efficient  and  im¬ 
prove  the  overall  validity  of  the  study’s  findings. 


DATA  ANALYSIS 

As  mentioned  earlier,  research  data  can  be  seen  as  the  fruit  of  researchers’ 
labor.  If  a  study  has  been  conducted  in  a  scientifically  rigorous  manner,  the 
data  will  hold  the  clues  necessary  to  answer  the  researchers’  questions.  To 
unlock these clues, researchers typically rely on a variety of statistical
procedures. These statistical procedures allow researchers to describe groups
of  individuals  and  events,  examine  the  relationships  between  different 
variables,  measure  differences  between  groups  and  conditions,  and  exam¬ 
ine  and  generalize  results  obtained  from  a  sample  back  to  the  population 
from  which  the  sample  was  drawn.  Knowledge  about  data  analysis  can 
help  a  researcher  interpret  data  for  the  purpose  of  providing  meaningful 
insights  about  the  problem  being  examined. 

Although  a  comprehensive  review  of  statistical  procedures  is  beyond 
the  scope  of  this  text,  in  general,  they  can  be  broken  down  into  two  major 
areas: descriptive and inferential. Descriptive statistics allow the researcher to
describe the data and examine relationships between variables, while inferential
statistics allow the researcher to test hypotheses about those relationships,
including causal ones when the study design supports them. In many
cases,  inferential  statistics  allow  researchers  to  go  beyond  the  parameters 
of  their  study  sample  and  draw  conclusions  about  the  population  from 
which  the  sample  was  drawn.  This  section  will  provide  a  brief  overview  of 
some  of  the  more  commonly  used  descriptive  and  inferential  statistics. 

Descriptive  Statistics 

As  their  name  implies,  descriptive  statistics  are  used  to  describe  the  data 
collected  in  research  studies  and  to  accurately  characterize  the  variables 
under  observation  within  a  specific  sample.  Descriptive  analyses  are  fre¬ 
quently  used  to  summarize  a  study  sample  prior  to  analyzing  a  study’s  pri¬ 
mary  hypotheses.  This  provides  information  about  the  overall  representa¬ 
tiveness  of  the  sample,  as  well  as  the  information  necessary  for  other 
researchers  to  replicate  the  study,  if  they  so  desire.  In  other  research  ef¬ 
forts  (i.e.,  purely  descriptive  studies),  precise  and  comprehensive  descrip¬ 
tions  may  be  the  primary  focus  of  the  study.  In  either  case,  the  principal 
objective  of  descriptive  statistics  is  to  accurately  describe  distributions  of 
certain  variables  within  a  specific  data  set. 

There  is  a  variety  of  methods  for  examining  the  distribution  of  a  vari¬ 
able.  Perhaps  the  most  basic  method,  and  the  starting  point  and  founda¬ 
tion  of  virtually  all  statistical  analyses,  is  the  frequency  distribution.  A 
frequency distribution is simply a complete list of all possible values or scores
for  a  particular  variable,  along  with  the  number  of  times  (frequency)  that 
each  value  or  score  appears  in  the  data  set.  For  example,  teachers  and  in¬ 
structors  who  want  to  know  how  their  classes  perform  on  certain  exams 
will  need  to  examine  the  overall  distribution  of  the  test  scores.  The  teacher 
would  begin  by  sorting  the  scores  so  that  they  go  from  the  lowest  to  the 
highest  and  then  count  the  number  of  times  that  each  score  occurred.  This 
information  can  be  delineated  in  what  is  known  as  a  frequency  table,  which 
is  illustrated  in  Table  7.1. 

To  make  the  distribution  of  scores  even  more  informative,  the  teacher 
could  group  the  test  scores  together  in  some  manner.  For  example,  the 


Table 7.1 Frequency Distribution of Test Scores

Value    Frequency    Cumulative Frequency
  71         1                  1
  76         1                  2
  78         2                  4
  81         2                  6
  82         1                  7
  83         1                  8
  84         2                 10
  85         2                 12
  86         2                 14
  87         1                 15
  89         1                 16
  90         2                 18
  94         3                 21
  98         1                 22
 100         1                 23

Table 7.2 Grouped Frequency Distribution of Test Scores

Value     Frequency    Cumulative Frequency
71-75         1                  1
76-80         3                  4
81-85         8                 12
86-90         6                 18
91-95         3                 21
96-100        2                 23

teacher  may  decide  to  group  the  test  scores  from  71  to  75,  76  to  80,  81  to 
85,  86  to  90, 91  to  95,  and  96  to  100.  This  type  of  grouping  would  result  in 
the  frequency  distribution  shown  in  Table  7.2. 

Still  another  way  that  this  distribution  may  be  depicted  is  in  what  is 
known  as  a  histogram.  A  histogram  (see  Figure  7.1)  is  nothing  more  than  a 
graphic  display  of  the  same  information  contained  in  the  frequency  tables 
shown  in  Tables  7.1  and  7.2. 
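
For readers who want to reproduce these summaries, the sketch below rebuilds both frequency tables from the 23 individual scores and prints a crude text histogram. The function names are our own; the scores are those summarized in Table 7.1.

```python
from collections import Counter

# The 23 test scores summarized in Table 7.1.
scores = [71, 76, 78, 78, 81, 81, 82, 83, 84, 84, 85, 85,
          86, 86, 87, 89, 90, 90, 94, 94, 94, 98, 100]

def frequency_table(values):
    """Return (value, frequency, cumulative frequency) rows sorted by value."""
    rows, running = [], 0
    for value, count in sorted(Counter(values).items()):
        running += count
        rows.append((value, count, running))
    return rows

def grouped_frequency_table(values, start, width):
    """Group values into intervals of equal width, beginning at `start`."""
    labels = [f"{lo}-{lo + width - 1}"
              for lo in range(start, max(values) + 1, width)]
    counts = Counter((v - start) // width for v in values)
    rows, running = [], 0
    for index, label in enumerate(labels):
        running += counts[index]
        rows.append((label, counts[index], running))
    return rows

table = frequency_table(scores)
grouped = grouped_frequency_table(scores, start=71, width=5)

# A crude text histogram of the grouped distribution.
for label, count, _ in grouped:
    print(f"{label:>6} | {'*' * count}")
```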
Although frequency tables and histograms provide researchers with a
general  overview  of  the  distribution,  there  are  more  precise  ways  of  de¬ 
scribing  the  shape  of  the  distribution  of  values  for  a  specific  variable. 
These  include  measures  of  central  tendency  and  dispersion. 

Central  Tendency 

The  central  tendency  of  a  distribution  is  a  number  that  represents  the  typical 
or  most  representative  value  in  the  distribution.  Measures  of  central  ten¬ 
dency  provide  researchers  with  a  way  of  characterizing  a  data  set  with  a 
single  value.  The  most  widely  used  measures  of  central  tendency  are  the 
mean,  median,  and  mode. 

The  mean,  except  in  statistics  courses  and  scientific  journals,  is  more 
commonly  known  as  the  average.  The  mean  is  perhaps  the  most  widely 
used  and  reported  measure  of  central  tendency.  The  mean  is  quite  simple 
to  calculate:  Simply  add  all  the  numbers  in  the  data  set  and  then  divide  by 
the  total  number  of  entries.  The  result  is  the  mean  of  the  distribution.  For 
example,  let’s  say  that  we  are  trying  to  describe  the  mean  age  of  a  group 
of  10  study  participants  with  the  following  ages: 

34  27  23  23  26  27  28  23  32  41 

The summed ages for the 10 participants is 284. Therefore, the mean age
of the sample is 284/10 = 28.40.

The  mean  is  quite  accurate  when  the  data  set  is  normally  distributed. 
Unfortunately,  the  mean  is  strongly  influenced  by  extreme  values  or  out¬ 
liers.  Therefore,  it  may  be  misleading  in  data  sets  in  which  the  values  are 
not  normally  distributed,  or  where  there  are  extreme  values  at  one  end  of 
the data set (skewed distributions).

For  example,  consider  a  situation  in  which  study  participants  report  an¬ 
nual  earnings  of  between  $25,000  and  $40,000.  The  mean  annual  income 
for  the  sample  might  wind  up  being  around  $35,000.  Now  consider  what 
would  happen  if  one  or  two  of  the  participants  reported  earnings  of 
$100,000  or  more.  Their  substantially  higher  salaries  (outliers)  would  dis¬ 
proportionately  increase  the  mean  income  for  the  entire  sample.  In  such 
instances, a median or mode may provide much more meaningful summary
information.

The  median,  as  implied  by  its  name,  is  the  middle  value  in  a  distribution 
of  values.  To  calculate  the  median,  simply  sort  all  of  the  values  from  low¬ 
est  to  highest  and  then  identify  the  middle  value.  The  middle  value  is  the 
median.  For  example,  sorting  the  set  of  ages  in  the  previous  example 
would  result  in  the  following: 

23  23  23  26  27  27  28  32  34  41 

In  this  instance,  the  median  is  27,  because  the  two  middle  values  are 
both  27,  with  four  values  on  either  side.  If  the  two  values  were  different, 
you  would  simply  split  the  difference  to  get  the  median.  For  example,  if  the 
two  middle  values  were  27  and  28,  the  median  would  be  27.5.  Calculation 
of  the  median  is  even  simpler  when  the  data  set  has  an  odd  number  of  val¬ 
ues.  In  these  cases,  the  median  is  simply  the  value  that  falls  exactly  in  the 
middle. 

The  mode  is  yet  another  useful  measure  of  central  tendency.  The  mode 
is  the  value  that  occurs  most  frequently  in  a  set  of  values.  To  find  the  mode, 
simply  count  the  number  of  times  (frequency)  that  each  value  appears  in  a 
data  set.  The  value  that  occurs  most  frequently  is  the  mode.  For  example, 
by  examining  the  sorted  distribution  of  ages  listed  below,  we  could  easily 
see  that  the  most  prevalent  age  in  the  sample  is  23,  which  is  therefore  the 
mode. 


23  23  23  26  27  27  28  32  34  41 

With  larger  data  sets,  the  mode  is  more  easily  identified  by  examining  a 
frequency  table,  as  described  earlier.  The  mode  is  very  useful  with  nomi¬ 
nal  and  ordinal  data  or  when  the  data  are  not  normally  distributed,  because 
it  is  not  influenced  by  extreme  values  or  outliers.  Therefore,  the  mode  is  a 
good  summary  statistic  even  in  cases  when  distributions  are  skewed.  Also 
note  that  a  distribution  can  have  more  than  one  mode.  Two  modes  would 
make  the  distribution  bimodal,  while  a  distribution  having  three  modes 
would  be  referred  to  as  trimodal. 
Interestingly, although the three measures of central tendency resulted
in  different  values  in  the  previous  examples,  in  a  perfectly  normal  distri¬ 
bution,  the  mean,  median,  and  mode  would  all  be  the  same. 
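
The three measures can be computed for the 10 participant ages with Python's standard statistics module, as sketched below. The income figures in the second half are hypothetical and serve only to show how a single outlier affects the mean far more than the median.

```python
from statistics import mean, median, multimode

ages = [34, 27, 23, 23, 26, 27, 28, 23, 32, 41]

age_mean = mean(ages)        # sum of the values divided by their count
age_median = median(ages)    # middle value (average of the two middle values here)
age_modes = multimode(ages)  # list of the most frequent value(s)

# Hypothetical incomes: one outlier pulls the mean but barely moves the median.
incomes = [25_000, 30_000, 35_000, 38_000, 40_000]
with_outlier = incomes + [100_000]

mean_shift = mean(with_outlier) - mean(incomes)
median_shift = median(with_outlier) - median(incomes)
```

Because multimode returns a list, it also handles bimodal and trimodal distributions without any changes.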

Dispersion 

Measures of central tendency, like the mean, describe the most likely value,
but they do not tell us anything about how the values vary. For example,
two sets of data can have the same mean, but they may vary greatly in the
way that their values are spread out. Another way of describing the shape
of a distribution is to examine this spread. The spread, more technically
referred to as the dispersion, of a distribution provides us with information
about how tightly grouped the values are around the center of the
distribution (e.g., around the mean, median, and/or mode). The most widely
used measures of dispersion are range, variance, and standard deviation.

The  range  of  a  distribution  tells  us  the  smallest  possible  interval  in  which 
all  the  data  in  a  certain  sample  will  fall.  Quite  simply,  the  range  is  the  dif¬ 
ference  between  the  highest  and  lowest  values  in  a  distribution.  Therefore, 
the  range  is  easily  calculated  by  subtracting  the  lowest  value  from  the  high¬ 
est  value.  Using  our  previous  example,  the  range  of  ages  for  the  study 
sample  would  be: 

41 - 23 = 18

Because  it  depends  on  only  two  values  in  the  distribution,  it  is  usually  a 
poor  measure  of  dispersion,  except  when  the  sample  size  is  particularly 
large. 

A  more  precise  measure  of  dispersion,  or  spread  around  the  mean  of  a 
distribution,  is  the  variance.  The  variance  gives  us  a  sense  of  how  closely 
concentrated  a  set  of  values  is  around  its  average  value,  and  is  calculated  in 
the  following  manner: 

1. Subtract the mean of the distribution from each of the values.

2. Square each result.

3. Add all of the squared results.

4. Divide the result by the number of values minus 1.

The variance of the set of 10 participant ages would therefore be calculated
in the following manner:

Variance = [(23 - 28.40)² + (23 - 28.40)² + (23 - 28.40)² + (26 - 28.40)²
          + (27 - 28.40)² + (27 - 28.40)² + (28 - 28.40)² + (32 - 28.40)²
          + (34 - 28.40)² + (41 - 28.40)²] / 9 = 300.40 / 9 = 33.38

The variance of a distribution gives us an average of how far, in squared
units, the values in a distribution are from the mean, which allows us to see
how closely concentrated the scores in a distribution are.

Another measure of the spread of values around the mean of a distribution
is the standard deviation. The standard deviation is simply the square
root of the variance. Therefore, the standard deviation for the set of
participant ages is:

√33.38 = 5.78
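
The same calculations can be checked in Python. The sketch below works through the four steps by hand and then compares the result with the standard library's sample variance and standard deviation, both of which divide by the number of values minus 1.

```python
from statistics import mean, stdev, variance

ages = [34, 27, 23, 23, 26, 27, 28, 23, 32, 41]

# Range: highest value minus lowest value.
age_range = max(ages) - min(ages)

# The four steps for the variance, written out by hand.
m = mean(ages)
squared_deviations = [(age - m) ** 2 for age in ages]        # steps 1 and 2
sample_variance = sum(squared_deviations) / (len(ages) - 1)  # steps 3 and 4
sample_sd = sample_variance ** 0.5                           # square root

# The standard library gives the same sample (n - 1) statistics.
assert abs(sample_variance - variance(ages)) < 1e-9
assert abs(sample_sd - stdev(ages)) < 1e-9
```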

By  taking  the  square  root  of  the  variance,  we  can  avoid  having  to  think  in 
terms  of  squared  units.  The  variance  and  the  standard  deviation  of  distri¬ 
butions  are  the  basis  for  calculating  many  other  statistics  that  estimate 
associations  and  differences  between  variables.  In  addition,  they  provide  us 
with  important  information  about  the  values  in  a  distribution.  For  ex¬ 
ample,  if  the  distribution  of  values  is  normal,  or  close  to  normal,  one  can 
conclude  the  following  with  reasonable  certainty: 

1. Approximately 68% of the values fall within 1 standard deviation
of the mean.

2. Approximately 95% of the values fall within 2 standard deviations
of the mean.

3. Approximately 99.7% of the values fall within 3 standard deviations
of the mean.

Therefore, assuming that the distribution is normal, we can estimate that
because the mean age of participants was 28.40 and the standard deviation
was 5.78, approximately 68% of the participants are within ±5.78 years (1
standard deviation) of the mean age of 28.40. Similarly, we can estimate
that 95% of the participants are within ±11.56 years (2 standard deviations)
of the mean age of 28.40. This information has several important
applications. First, like the measures of central tendency, it allows the
researcher to describe the overall characteristics of a sample. Second, it
allows researchers to compare individual participants on a given variable
(e.g., age). Third, it provides a way for researchers to compare an individual
participant’s performance on one variable (e.g., IQ score) with his or
her performance on another (e.g., SAT score), even when the variables are
measured on entirely different scales.
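
One common way to make such comparisons is to express each score as the number of standard deviations it lies from its mean, often called a standard score or z-score. The sketch below is an illustration only: the function name is ours, and the IQ and SAT norms are hypothetical values chosen to keep the arithmetic simple.

```python
from statistics import mean, stdev

ages = [34, 27, 23, 23, 26, 27, 28, 23, 32, 41]
m, sd = mean(ages), stdev(ages)

def z_score(value, mean_value, sd_value):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean_value) / sd_value

# Share of this small sample that falls within 1 and 2 standard deviations.
within_1_sd = sum(abs(z_score(a, m, sd)) <= 1 for a in ages) / len(ages)
within_2_sd = sum(abs(z_score(a, m, sd)) <= 2 for a in ages) / len(ages)

# Hypothetical norms: IQ (mean 100, SD 15) and an SAT section (mean 500, SD 100).
iq_z = z_score(115, 100, 15)
sat_z = z_score(650, 500, 100)
```

With only 10 participants and a skewed set of ages, the observed shares will not match the 68% and 95% figures closely; those figures describe a normal distribution, not every sample.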

Measures  of  Association 

In  addition  to  describing  the  shape  of  variable  distributions,  another  im¬ 
portant  task  of  descriptive  statistics  is  to  examine  and  describe  the  rela¬ 
tionships  or  associations  between  variables. 

Correlations are perhaps the most basic and most useful measure of association
between two or more variables. Expressed in a single number
called a correlation coefficient (r), correlations provide information about the
direction of the relationship (either positive or negative) and the intensity
of the relationship (-1.0 to +1.0). Furthermore, tests of correlations will
provide information on whether the correlation is statistically significant.
There is a wide variety of correlations that, for the most part, are determined
by the type of variable (e.g., categorical, continuous) being analyzed.

With  regard  to  the  direction  of  a  correlation,  if  two  variables  tend  to 
move  in  the  same  direction  (e.g.,  height  and  weight),  they  would  be  con¬ 
sidered  to  have  a  positive  or  direct  relationship.  Alternatively,  if  two  variables 
move  in  opposite  directions  (e.g.,  cigarette  smoking  and  lung  capacity), 
they  are  considered  to  have  a  negative  or  inverse  relationship.  Figure  7.2  gives 
examples  of  both  types. 

Correlation coefficients range from -1.0 to +1.0. The sign of the coefficient
represents the direction of the relationship. For example, a correlation
of .78 would indicate a positive or direct correlation, while a correlation
of -.78 would indicate a negative or inverse correlation. The

Figure 7.2 Positive and negative correlation directions.


coefficient (value) itself indicates the strength of the relationship. The
closer it gets to 1.0 (whether it is negative or positive), the stronger the
relationship. In general, correlations of .01 to .30 are considered small,
correlations of .30 to .70 are considered moderate, correlations of .70 to .90
are considered large, and correlations of .90 to 1.00 are considered very
large. Importantly, these are only rough guidelines. A number of other
factors, such as sample size, need to be considered when interpreting
correlations.

In addition to the direction and strength of a correlation, the coefficient
can be used to determine the proportion of variance accounted for by the
association. This is known as the coefficient of determination (r²). The
coefficient of determination is calculated quite easily by squaring the
correlation coefficient. For example, if we found a correlation of .70 between
cigarette smoking and use of cocaine, we could calculate the coefficient of
determination in the following manner:

.70 × .70 = .49

The coefficient of determination is then transformed into a percentage.
Therefore, a correlation of .70, as indicated in the equation, explains
approximately 49% of the variance. In this example, we could conclude that
49% of the variance in cocaine use is accounted for by cigarette smoking.
Alternatively, a correlation of .20 would have a coefficient of determination
of .04 (.20 × .20 = .04), strongly indicating that other variables are
likely involved. Importantly, as the reader might remember, correlation is
not causation. Therefore, we cannot infer from this correlation that cigarette
smoking causes or influences cocaine use. It is equally plausible that
cocaine use causes cigarette smoking, or that both unhealthy behaviors are
caused by a third unknown variable.
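
A correlation coefficient and its coefficient of determination can be computed directly from paired observations. The sketch below uses the standard formula for the product-moment correlation; the function name and the small data set are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-variation
    sxx = sum((a - mx) ** 2 for a in x)                   # variation in x
    syy = sum((b - my) ** 2 for b in y)                   # variation in y
    return sxy / sqrt(sxx * syy)

# Hypothetical paired scores for six participants.
x = [2, 4, 6, 8, 10, 12]
y = [1, 3, 2, 5, 4, 6]

r = pearson_r(x, y)
r_squared = r ** 2                # coefficient of determination
percent_explained = 100 * r_squared
```

Note that swapping x and y gives exactly the same r, which is one way to see why a correlation by itself cannot say which variable, if either, influences the other.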

Although correlations are typically regarded as descriptive in nature,
they can, unlike measures of central tendency and dispersion, be
tested for statistical significance. Tests of significance allow us to estimate
the likelihood that a relationship between variables in a sample actually
exists in the population and is not simply the result of chance. In very
general terms, the significance of a relationship is determined by comparing
the results or findings with what would occur if the variables were
totally unrelated (independent) and if the distributions of each dependent
variable were identical. The primary index of statistical significance is the
p-value. The p-value represents the probability of obtaining a result at
least as strong as the one observed purely by chance, that is, if no
relationship existed in the population. For example, if we were examining the
correlation between two variables, a p-value of .05 would indicate that,
assuming that there was no relationship between those variables whatsoever,
we could expect to find a result at least this strong, by chance, about
5 times out of 100. In other words, significance levels inform us about the
degree of confidence that we can have in our findings.
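
The logic of comparing a finding with what would occur if the variables were totally unrelated can be made concrete with a permutation test. This is a technique we are substituting for illustration; it is not the significance test that statistical packages usually report for a correlation. The sketch reorders one variable in every possible way, so it is practical only for very small samples, and the data are hypothetical.

```python
from itertools import permutations
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(x, y):
    """Two-sided exact permutation p-value for the correlation between x and y.

    Every reordering of y represents a data set in which x and y are unrelated.
    The p-value is the share of reorderings whose |r| is at least as large as
    the observed |r|.
    """
    observed = abs(pearson_r(x, y))
    results = [abs(pearson_r(x, list(p))) for p in permutations(y)]
    # Small tolerance so that exact ties are not lost to rounding error.
    extreme = sum(r >= observed - 1e-12 for r in results)
    return extreme / len(results)

# Hypothetical paired observations for six participants.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

p_value = permutation_p_value(x, y)
significant = p_value < 0.05  # conventional threshold, chosen in advance
```

With so few participants, even a fairly strong correlation can fail to reach the conventional .05 threshold, which is one reason sample size matters when interpreting correlations.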

There  is  a  wide  selection  of  correlations  that,  for  the  most  part,  are  de¬ 
termined  by  the  type  of  scale  (i.e.,  nominal,  ordinal,  interval,  or  ratio)  on 
which  the  variables  are  measured.  One  of  the  most  widely  used  correla¬ 
tions  is  the  Pearson  product-moment  correlation,  often  referred  to  as  the 
Pearson  r.  The  Pearson  r  is  used  to  examine  associations  between  two  vari¬ 
ables  that  are  measured  on  either  ratio  or  interval  scales.  For  example,  the 
Pearson  r  could  be  used  to  examine  the  correlation  between  days  of  exer¬ 
cise  and  pounds  of  weight  loss. 

Other  types  of  correlations  include  the  following: 

•  Point-biserial (r_pbi): This is used to examine the relationship between
a variable measured on a naturally occurring dichotomous
nominal scale and a variable measured on an interval (or ratio)
scale (e.g., a correlation between gender [dichotomous] and SAT
scores [interval]).

•  Spearman rank-order (r_s): This is used to examine the relationship
between two variables measured on ordinal scales (e.g., a correlation
of class rank [ordinal] and socioeconomic status [ordinal]).

•  Phi (Φ): This is used to examine the relationship between two
variables that are naturally dichotomous (nominal-dichotomous;
e.g., a correlation of gender [nominal-dichotomous] and marital status
[nominal-dichotomous]).

•  Gamma (γ): This is used to examine the relationship between one
nominal variable and one variable measured on an ordinal scale
(e.g., a correlation of ethnicity [nominal] and socioeconomic status
[ordinal]).
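
As one example from this list, the Spearman rank-order correlation can be obtained by converting each variable to ranks and then computing an ordinary product-moment correlation on those ranks. The sketch below follows that approach, giving tied values the average of their ranks; the function names and the data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_r(x, y):
    """Spearman rank-order correlation: the product-moment r of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical ordinal data: class rank (1 = top) and SES level (1 = low, 5 = high).
class_rank = [1, 2, 3, 4, 5, 6]
ses_level = [5, 4, 4, 2, 3, 1]

rho = spearman_r(class_rank, ses_level)
```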


Inferential  Statistics 

In  the  previous  section,  we  provided  a  general  overview  of  the  most  widely 
used  descriptive  statistics,  including  measures  of  central  tendency,  disper¬ 
sion,  and  correlation.  In  addition  to  describing  and  examining  associations 
of  variables  within  our  data  sets,  we  often  conduct  research  to  answer 
questions  about  the  greater  population.  Because  it  would  not  be  feasible 
to  collect  data  from  the  entire  population,  researchers  conduct  research 
with  representative  samples  (see  Chapters  2  and  3)  in  an  attempt  to  draw 
inferences  about  the  populations  from  which  the  samples  were  drawn. 
The  analyses  used  to  examine  these  inferences  are  appropriately  referred 
to  as  inferential  statistics. 

Inferential  statistics  help  us  to  draw  conclusions  beyond  our  immediate 
samples  and  data.  For  example,  inferential  statistics  could  be  used  to  infer, 
from  a  relatively  small  sample  of  employees,  what  the  job  satisfaction  is 
likely  to  be  for  a  company’s  entire  work  force.  Similarly,  inferential  statis¬ 
tics  could  be  used  to  infer,  from  between-group  differences  in  a  particular 
study  sample,  how  effective  a  new  treatment  or  medication  may  be  for  a 
larger population. In other words, inferential statistics help us to draw
general conclusions about the population on the basis of the findings
identified in a sample. However, as with any generalization, there is some degree
of  uncertainty  or  error  that  must  be  considered.  Fortunately,  inferential 
statistics  provide  us  with  not  only  the  means  to  make  inferences,  but  the 
means  to  specify  the  amount  of  probable  error  as  well. 

Inferential  statistics  typically  require  random  sampling.  As  discussed  in 
Chapters  2  and  3,  this  increases  the  likelihood  that  a  sample,  and  the  data 
that  it  generates,  are  representative  of  the  population.  Although  there  are 
other  techniques  for  acquiring  a  representative  sample  (e.g.,  selecting  in¬ 
dividuals  that  match  the  population  on  the  most  important  characteris¬ 
tics),  random  sampling  is  considered  to  be  the  best  method,  because  it 
works  to  ensure  representativeness  on  all  characteristics  of  the  popula¬ 
tion — even  those  that  the  researcher  may  not  have  considered. 

Inferences  begin  with  the  formulation  of  specific  hypotheses  about  what 
we  expect  to  be  true  in  the  population.  However,  as  discussed  in  Chapter  2, 
we  can  never  actually  prove  a  hypothesis  with  complete  certainty.  Therefore, 
we  must  test  the  null  hypothesis,  and  determine  whether  it  should  be  re¬ 
tained  or  rejected.  For  example,  in  a  randomized  controlled  trial  (see  Chap¬ 
ter  5),  we  may  expect,  based  on  prior  research,  that  a  group  receiving  a 
certain  treatment  would  have  better  outcomes  than  a  group  receiving  a 
standard  treatment.  In  this  case,  the  null  hypothesis  would  predict  no 
between-group  differences.  Similarly,  in  the  case  of  correlation,  the  null  hy¬ 
pothesis  would  predict  that  the  variables  in  question  would  not  be  related. 

There  are  numerous  inferential  statistics  for  researchers  to  choose 
from.  The  selection  of  the  appropriate  statistics  is  largely  determined  by 
the  nature  of  the  research  question  being  asked  and  the  types  of  variables 
being  analyzed.  Because  a  comprehensive  review  of  inferential  statistics 
could  fill  many  volumes  of  text,  we  will  simply  provide  a  basic  overview  of 
several  of  the  most  widely  used  inferential  statistical  procedures,  including 
the t-test, analysis of variance (ANOVA), chi-square, and regression.

T-Test

T-tests are used to test mean differences between two groups. In general,
they require a single dichotomous independent variable (e.g., an experimental
and a control group) and a single continuous dependent variable.
For example, t-tests can be used to test for mean differences between experimental
and control groups in a randomized experiment, or to test for
mean differences between two groups in a nonexperimental context (such
as whether cocaine users and heroin users differ in reported criminal activity). When
a researcher wishes to compare the average (mean) performance between
two groups on a continuous variable, he or she should consider the t-test.
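
As a rough illustration of the comparison a t-test makes, the sketch below computes a pooled-variance t statistic by hand. The scores and the function name independent_t are hypothetical, invented for this example; in practice a statistical package would also report the exact p-value.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / se
    return t, n_a + n_b - 2

# Hypothetical outcome scores for two small groups.
experimental = [12, 15, 14, 10, 13]
control = [9, 11, 8, 12, 10]

t, df = independent_t(experimental, control)
print(f"t({df}) = {t:.3f}")
```

With 8 degrees of freedom, the two-tailed critical value at the .05 level is 2.306, so a t statistic larger than that in absolute value would lead us to reject the null hypothesis of no mean difference.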

Analysis  of  Variance  (ANOVA) 

Often characterized as an omnibus t-test, an ANOVA is also a test of mean
comparisons. In fact, the main difference between a t-test and an
ANOVA is that the ANOVA can compare means across more than two
groups or conditions. Therefore, a t-test is just a special case of ANOVA.
If you analyze the means of two groups by ANOVA, you get the same results
as you would with a t-test. Although a researcher could use a series of
t-tests to examine the differences between more than two groups, this
would  not  only  be  less  efficient,  but  it  would  add  experiment-wise  error 
(see  Rapid  Reference  7.4),  thereby  increasing  the  chances  of  spurious  re¬ 
sults  (i.e.,  Type  I  errors;  see  Chapter  1)  and  compromising  statistical  con¬ 
clusion  validity. 

Interestingly,  despite  its  name,  the  ANOVA  works  by  comparing  the 
differences  between  group  means  rather  than  the  differences  between 
group  variances.  The  name  “analysis  of  variance”  comes  from  the  way  the 
procedure  uses  variances  to  decide  whether  the  means  are  different. 
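
To make the role of variances concrete, the following sketch computes a one-way ANOVA F ratio as the variance between group means divided by the variance within groups. The data and the function name one_way_anova_f are hypothetical. With only two groups, the square root of F equals the absolute value of the pooled-variance t statistic, which is why the t-test can be viewed as a special case of ANOVA.

```python
import math
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, (df_between, df_within)) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)
    n_total = len(all_scores)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, (k - 1, n_total - k)

# Hypothetical depression-related scores under three treatments.
treatment_1 = [12, 15, 14, 10, 13]
treatment_2 = [9, 11, 8, 12, 10]
treatment_3 = [7, 9, 8, 6, 10]

f_three, df_three = one_way_anova_f(treatment_1, treatment_2, treatment_3)
f_two, df_two = one_way_anova_f(treatment_1, treatment_2)

print(f"Three groups: F{df_three} = {f_three:.3f}")
print(f"Two groups:   F{df_two} = {f_two:.3f}, sqrt(F) = {math.sqrt(f_two):.3f}")
```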

There  are  numerous  different  variations  of  the  ANOVA  procedure  to 
choose  from,  depending  on  the  study  hypothesis  and  research  design.  For 
example,  a  one-way  ANOVA  is  used  to  compare  the  means  of  two  or  more 
levels  of  a  single  independent  variable.  So,  we  may  use  an  ANOVA  to 
examine  the  differential  effects  of  three  types  of  treatment  on  level  of 
depression. 


                Treatment for Depression

Treatment 1         Treatment 2         Treatment 3

Rapid Reference 7.4

Multiple Comparisons and Experiment-wise Error

Most research studies perform many tests of their hypotheses. For example,
a researcher testing a new educational technique may choose to
examine the technique’s effectiveness by measuring students’ test scores,
satisfaction ratings, class grades, and SAT scores. If there is a 5% chance
(with a significance criterion of .05) of finding a significant result by chance
alone on one outcome measure, there is up to a 20% chance (.05 × 4) of
finding at least one such result when using four outcome measures (about
19%, or 1 − .95⁴, if the four tests are independent). This inflated likelihood
of obtaining a spuriously significant result is referred to as experiment-wise
error. This can be corrected for either by using a statistical test that takes
this error into account (e.g., multivariate ANOVA, or MANOVA; see text) or
by lowering the significance criterion to account for the number of
comparisons being performed. The simplest and the most conservative
method of controlling for experiment-wise error is the Bonferroni correction.
Using this correction, the researcher simply divides the set significance
criterion by the number of statistical comparisons being made
(e.g., .05/4 = .0125). The resulting value is then the new criterion
that a p-value must meet to reach statistical significance.
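
The arithmetic behind experiment-wise error and the Bonferroni correction can be sketched in a few lines. The function names below are our own, and the first calculation assumes that the comparisons are statistically independent, which is rarely exactly true of outcome measures taken from the same participants.

```python
def experimentwise_error(alpha, n_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_criterion(alpha, n_tests):
    """Per-comparison significance criterion under the Bonferroni correction."""
    return alpha / n_tests

alpha, n_tests = 0.05, 4
uncorrected = experimentwise_error(alpha, n_tests)
per_test = bonferroni_criterion(alpha, n_tests)
corrected = experimentwise_error(per_test, n_tests)

print(f"Uncorrected experiment-wise error: {uncorrected:.4f}")
print(f"Bonferroni criterion per test:     {per_test:.4f}")
print(f"Corrected experiment-wise error:   {corrected:.4f}")
```

Under these assumptions the uncorrected error rate is roughly .19, and applying the corrected criterion to each test brings the overall rate back just under .05.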


Alternatively, multifactor ANOVAs can be used when a study involves
two or more independent variables. For example, a researcher might employ
a 2 × 3 factorial design (see Chapter 5) to examine the effectiveness
of the different treatments (Factor 1) and high or low levels of physical exercise
(Factor 2) in reducing symptoms of depression.


                        Treatment for Depression

Exercise        Treatment 1     Treatment 2     Treatment 3

Low

High

Because  the  study  involves  two  factors  (or  independent  variables),  the 
researcher would conduct a two-way ANOVA. Similarly, if the study had
three factors, a three-way ANOVA would be used, and so forth. A multifactor
ANOVA allows a researcher to examine not only the main effects
of  each  independent  variable  (the  different  treatments  and  high  or  low  lev¬ 
els  of  exercise)  on  depression,  but  also  the  potential  interaction  of  the  two 
independent  variables  in  combination. 

Still another variant of the ANOVA is the multivariate analysis of variance, or
MANOVA. The MANOVA is used when there are two or more dependent
variables that are generally related in some way. Using the previous
example, let’s say that we were measuring the effect of the different treatments,
with or without exercise, on depression measured in several different
ways. Although we could conduct separate ANOVAs for each of these
outcomes, the MANOVA provides a more efficient and more informative
way of analyzing the data.

Chi-Square (χ²)

The inferential statistics that we have discussed so far (i.e., t-tests,
ANOVA) are appropriate only when the dependent variables being measured
are continuous (interval or ratio). In contrast, the chi-square statistic
allows  us  to  test  hypotheses  using  nominal  or  ordinal  data.  It  does  this 
by  testing  whether  one  set  of  proportions  is  higher  or  lower  than  you 
would  expect  by  chance.  Chi-square  summarizes  the  discrepancy  between 
observed  and  expected  frequencies.  The  smaller  the  overall  discrepancy 
is  between  the  observed  and  expected  scores,  the  smaller  the  value  of 
the  chi-square  will  be.  Conversely,  the  larger  the  discrepancy  is  between 
the  observed  and  expected  scores,  the  larger  the  value  of  the  chi-square 
will  be. 

For  example,  in  a  study  of  employment  skills,  a  researcher  may  ran¬ 
domly  assign  consenting  individuals  to  an  experimental  or  a  standard 
skills-training  intervention.  The  researcher  might  hypothesize  that  a 
higher  percentage  of  participants  who  attended  the  experimental  inter¬ 
vention  would  be  employed  at  1  year  follow-up.  Because  the  outcome  be¬ 
ing  measured  is  dichotomous  (employed  or  not  employed),  the  researcher 
could  use  a  chi-square  to  test  the  null  hypothesis  that  employment  at  the 
1 year follow-up is not related to the skills training.

Similarly, chi-square analysis is often used to examine between-group
differences on categorical variables, such as gender, marital status, or
grade level. The main thing to remember is that the data must be nominal
or ordinal because chi-square is a test of proportions. Also, because it
compares the tallies of categorical responses between two or more groups,
the chi-square statistic can be computed only from actual counts and not
from precalculated percentages or proportions.
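
The employment example can be worked through as a minimal sketch. The counts below are hypothetical, and the function name chi_square is our own. The p-value line relies on the fact that, with 1 degree of freedom, the chi-square distribution is that of a squared standard normal variable; it applies only to 2 × 2 tables, and no continuity correction is used.

```python
import math

def chi_square(observed):
    """Return (statistic, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Frequency expected if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts: rows are interventions, columns are outcomes.
#            employed  not employed
counts = [[30, 20],   # experimental skills training
          [18, 32]]   # standard skills training

statistic, df = chi_square(counts)
# Valid for df == 1 only: P(Z**2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"chi-square({df}) = {statistic:.3f}, p = {p_value:.4f}")
```

Note that the function takes raw counts. Feeding it percentages would discard the sample size and produce a meaningless statistic, which is the point made in the paragraph above.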

Regression 

Linear regression is a method of estimating or predicting a value on some dependent
variable given the values of one or more independent variables.
Like  correlations,  statistical  regression  examines  the  association  or  rela¬ 
tionship  between  variables.  Unlike  with  correlations,  however,  the  pri¬ 
mary  purpose  of  regression  is  prediction.  For  example,  insurance  ad¬ 
justers  may  be  able  to  predict  or  come  close  to  predicting  a  person’s  life 
span  from  his  or  her  current  age,  body  weight,  medical  history,  history  of 
tobacco  use,  marital  status,  and  current  behavioral  patterns. 

There  are  two  basic  types  of  regression  analysis:  simple  regression  and 
multiple  regression.  In  simple  regression,  we  attempt  to  predict  the  depen¬ 
dent  variable  with  a  single  independent  variable.  In  multiple  regression,  as  in 
the  case  of  the  insurance  adjuster,  we  may  use  any  number  of  independent 
variables  to  predict  the  dependent  variable. 
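
A simple regression can be sketched with ordinary least squares, which chooses the line that minimizes the squared prediction errors. The data and the names fit_line and predict are hypothetical, chosen only to show how a fitted line is used for prediction.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given x."""
    return intercept + slope * x

# Hypothetical data: weekly study hours (independent variable)
# and test scores (dependent variable).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = predict(6, slope, intercept)

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"Predicted score for 6 hours: {predicted:.1f}")
```

Multiple regression extends the same idea to several independent variables at once, which is usually left to statistical software.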

Logistic regression, unlike its linear counterpart, is designed to
predict dichotomous variables, such as the presence or absence of a specific
outcome, based on a specific set of independent or predictor variables.
Like correlation, logistic regression provides information about the
strength and direction of the association between the variables. In addition,
logistic regression coefficients can be used to estimate odds ratios for
each of the independent variables in the model. These odds ratios can tell us
how much the odds of a dichotomous outcome change with each independent
variable, holding the other variables in the model constant.

A  common  application  of  logistic  regression  is  to  determine  whether 
and  to  what  degree  a  set  of  hypothesized  risk  factors  might  predict  the  on¬ 
set of a certain condition. For example, a drug abuse researcher may wish
to determine whether certain lifestyle and behavioral patterns place former
drug abusers at risk for relapse. The researcher may hypothesize that
three  specific  factors — living  with  a  drug  or  alcohol  user,  psychiatric  sta¬ 
tus,  and  employment  status — will  predict  whether  a  former  drug  abuser 
will  relapse  within  1  month  of  completing  drug  treatment.  By  measuring 
these  variables  in  a  sample  of  successful  drug-treatment  clients,  the  re¬ 
searcher  could  build  a  model  to  predict  whether  they  will  have  relapsed  by 
the 1-month follow-up assessment. The model could also be used to estimate
the odds ratios for each variable. For example, the odds ratios could
provide  information  on  how  much  more  likely  unemployed  individuals 
are  to  relapse  than  employed  individuals. 
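
For a single dichotomous predictor, the odds ratio can be computed directly from a table of counts, and its natural logarithm equals the coefficient that a logistic regression with only that predictor would estimate. The counts and the function name odds_ratio below are hypothetical.

```python
import math

def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no):
    """Odds of the outcome in the exposed group divided by the odds
    of the outcome in the unexposed group."""
    return (exposed_yes / exposed_no) / (unexposed_yes / unexposed_no)

# Hypothetical 1-month follow-up counts.
unemployed_relapsed, unemployed_abstinent = 30, 20
employed_relapsed, employed_abstinent = 15, 35

ratio = odds_ratio(unemployed_relapsed, unemployed_abstinent,
                   employed_relapsed, employed_abstinent)
# Equivalent logistic regression coefficient for the single predictor.
coefficient = math.log(ratio)

print(f"Odds ratio: {ratio:.2f}")
print(f"Logistic coefficient (log odds ratio): {coefficient:.3f}")
```

An odds ratio of 3.5 in this made-up example would mean that the odds of relapse for unemployed clients are three and a half times the odds for employed clients. With several predictors in the model, each odds ratio is adjusted for the others and must be estimated by fitting the full model.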

INTERPRETING  DATA  AND  DRAWING  INFERENCES 

Even  researchers  who  carefully  planned  their  studies  and  collected,  man¬ 
aged,  and  analyzed  their  data  with  the  highest  integrity  might  still  make 
mistakes  when  interpreting  their  data.  Unfortunately,  although  all  of  the 
previous  steps  are  necessary,  they  are  far  from  sufficient  to  ensure  that  the 
moral  of  the  story  is  accurately  understood  and  disseminated.  This  section 
will  highlight  some  of  the  most  critical  issues  to  consider  when  interpret¬ 
ing  data  and  drawing  inferences  from  your  findings. 

Are  You  Fully  Powered? 

One  of  the  ways  that  study  findings  can  be  misinterpreted  is  through  in¬ 
sufficient  statistical  power.  Until  fairly  recently,  most  research  studies  were 
conducted  without  any  consideration  of  this  concept.  In  simple  terms,  sta¬ 
tistical  power  is  a  measure  of  the  probability  that  a  statistical  test  will  reject  a 
false  null  hypothesis,  or  in  other  words,  the  probability  of  finding  a  signif¬ 
icant  result  when  there  really  is  one.  The  higher  the  power  of  a  statistical 
test,  the  more  likely  one  is  to  find  statistical  significance  if  the  null  hy¬ 
pothesis  is  actually  false  (i.e.,  if  there  really  is  an  effect). 

For example, to test the null hypothesis that Republicans are as intelligent
as Democrats, a researcher might recruit a random bipartisan sample,
have them complete certain measures of intelligence, and compare their
mean scores using a t-test or ANOVA. If Republicans and Democrats do
indeed  differ  on  intelligence  in  the  population,  but  the  sample  data  indi¬ 
cate  that  they  do  not,  a  Type  II  error  has  been  made  (see  Chapter  1  for  a 
discussion  of  Type  I  and  Type  II  errors).  A  potential  reason  that  the  study 
reached  such  a  faulty  conclusion  may  be  that  it  lacked  sufficient  statistical 
power  to  detect  the  actual  differences  between  Republicans  and  Democ¬ 
rats. 

According to Cohen (1988), studies should strive for statistical power
of .80 or greater to limit the risk of Type II errors. Statistical power is largely
determined by three factors: (1) the significance criterion (e.g., .05, .01); (2) the
effect size (i.e., the magnitude of the differences between group means or
other test statistics); and (3) the size of the sample. Researchers should
calculate the statistical power of each of their planned analyses prior to
beginning a study. This will allow them to determine the sample size
necessary to obtain sufficient power (≥ .80) based on the set significance
criterion and the anticipated effect size.
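
The relationship among the three factors can be sketched with a common normal-approximation formula for comparing two group means, n per group = 2(z for the significance criterion + z for power)² / d², where d is the standardized effect size. The function name n_per_group is our own, and because this approximation ignores the small-sample behavior of the t distribution, dedicated power software will typically return a slightly larger number.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-tailed,
    two-group comparison of means (normal approximation)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance criterion
    z_power = standard_normal.inv_cdf(power)          # desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

# Cohen's conventional small, medium, and large standardized effects.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label:6s} effect (d = {d}): about {n_per_group(d)} per group")
```

The sketch shows why anticipated effect size matters so much at the planning stage: halving the expected effect roughly quadruples the required sample.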

Unfortunately,  determining  that  there  is  enough  power  at  the  outset  of 
a  study  does  not  always  ensure  that  sufficient  power  will  be  available  at  the 
time  of  the  analysis.  Many  changes  may  occur  in  the  interim.  For  example, 
the  sample  size  may  be  reduced,  due  to  lower  than  expected  recruitment 
rates  or  attrition;  or  the  effect  sizes  may  be  different  than  expected.  In  any 
case,  the  take-home  message  for  researchers  is  that  they  must  always  con¬ 
sider  how  much  power  is  available  to  detect  differences  between  groups. 
This  is  particularly  important  when  interpreting  the  results  of  a  study  in 
which  no  significant  differences  were  found,  because  it  may  be  that  sig¬ 
nificant  differences  existed,  but  there  was  insufficient  power  to  detect 
them. 

Are  Your  Distributions  in  Good  Shape? 

Another  factor  that  can  lead  to  faulty  interpretations  of  statistical  findings 
is the failure to consider the characteristics of the distribution. Virtually all
statistical tests have certain basic assumptions. For example, parametric
tests (e.g., t-tests, ANOVA, linear regression) require that the distribution
of data meet certain requirements (i.e., normality and independence).
Failure to meet these assumptions may cause the results of an analysis to
be inaccurate. Although statistics such as the t-test and ANOVA are considered
relatively robust (see Rapid Reference 7.5) in terms of their
sensitivity to normality, this is less true for the assumption of independence.
For example, if a researcher were comparing the effect of two different
teachers’ methods on students’ final grades, the researcher would
have  to  make  certain  that  none  of  the  students  had  classes  with  both 
teachers.  If  certain  students  had  classes  with  both  teachers,  and  were 
therefore  exposed  to  both  teaching  methods,  the  assumption  of  indepen¬ 
dence  would  have  been  violated.  Because  of  this,  probability  statements 
regarding  Type  I  and  Type  II  errors  may  be  seriously  affected. 

Another  aspect  of  the  distribution  that  should  be  considered  when  in¬ 
terpreting  study  findings  is  data  outliers.  As  discussed  earlier,  extreme  val¬ 
ues  in  the  distribution  can  substantially  skew  the  shape  of  the  distribution 
and  alter  the  sample  mean.  Researchers  should  carefully  examine  the  dis¬ 
tributions  of  their  data  to  identify  potential  outliers.  Once  identified,  out¬ 
liers  can  be  either  replaced  with  missing  values  or  transformed  through 
one  of  several  available  procedures  (discussed  previously  in  this  chapter). 

Rapid Reference 7.5

Robustness of Statistical Tests

Robustness of a statistical test refers to the degree to which it is
resistant to violations of certain assumptions. The robustness of
certain statistical techniques does not mean they are totally immune
to such violations, but merely that they are less sensitive to them.

Still another aspect of the distribution that should be considered when
analyzing and interpreting data is the range of values. Researchers often
fail to find significant relationships because of the restricted range or variance
of a dependent variable. For example, suppose you were examining
the relationship between IQ and SAT scores, but everyone in the sample
scored between 1100 and 1200 on their SATs. In this case, because of the
restricted range, you would be unlikely to find a significant relationship,
even if one did exist in the population.
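
The effect of a restricted range can be demonstrated with a small simulation. Everything in the sketch below is hypothetical: the simulated IQ and SAT scores, the strength of the relationship built into them, and the helper name pearson_r. The point is only that the same underlying relationship looks much weaker when the sample is limited to a narrow band of SAT scores.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Simulated population in which SAT scores depend partly on IQ.
iq = [rng.gauss(100, 15) for _ in range(5000)]
sat = [1000 + 8 * (q - 100) + rng.gauss(0, 120) for q in iq]

# Keep only people who scored between 1100 and 1200 on the SAT.
restricted = [(q, s) for q, s in zip(iq, sat) if 1100 <= s <= 1200]

r_full = pearson_r(iq, sat)
r_restricted = pearson_r([q for q, _ in restricted],
                         [s for _, s in restricted])

print(f"Full sample:       r = {r_full:.2f} (n = {len(iq)})")
print(f"Restricted sample: r = {r_restricted:.2f} (n = {len(restricted)})")
```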

Are  You  Fishing? 

Although  we  covered  the  issue  of  multiple  comparisons  and  experiment- 
wise  error  earlier  in  this  chapter,  it  deserves  additional  mention  here  be¬ 
cause  it  can  seriously  impact  the  interpretation  of  your  findings.  In  general, 
experiment-wise  error  refers  to  the  probability  of  committing  Type  I  errors 
for  a  set  of  statistical  tests  in  the  same  experiment.  When  you  make  many 
comparisons involving the same data, the probability that at least one of the
comparisons will be statistically significant by chance alone increases. Thus,
experiment-wise error may exceed a chosen significance level. If you make enough
comparisons, one or more of the results will almost certainly be significant.
Colloquially, this is often referred to as “fishing,” because if you cast out your
line  enough  times  you  are  bound  to  catch  something.  Although  this  may 
be  a  good  strategy  for  anglers,  in  research  it  is  just  bad  science.  This  issue 
is  most  likely  to  occur  when  examining  complex  hypotheses  that  require 
many  different  comparisons.  Failing  to  correct  for  these  multiple  compar¬ 
isons  can  lead  to  substantial  Type  I  error  and  to  faulty  interpretations  of 
your  findings. 
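
A short simulation shows how readily fishing produces a catch. In the sketch below, both groups are always drawn from the same population, so every significant result is a Type I error. The function names are our own, the data are random numbers, and a z-test with a known standard deviation stands in for the t-test to keep the example self-contained.

```python
import math
import random
from statistics import NormalDist

def z_test_p(group_a, group_b, sigma=1.0):
    """Two-tailed p-value for a difference in means when sigma is known."""
    se = sigma * math.sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (sum(group_a) / len(group_a) - sum(group_b) / len(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def fishing_rate(n_experiments=1000, n_tests=20, n_per_group=20,
                 alpha=0.05, seed=1):
    """Proportion of simulated experiments with at least one 'significant'
    comparison, even though no true difference exists."""
    rng = random.Random(seed)
    experiments_with_a_catch = 0
    for _ in range(n_experiments):
        for _ in range(n_tests):
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            if z_test_p(a, b) < alpha:
                experiments_with_a_catch += 1
                break  # one spurious finding is enough
    return experiments_with_a_catch / n_experiments

rate = fishing_rate()
print(f"Experiments with at least one spurious result: {rate:.0%}")
```

With 20 independent comparisons at the .05 level, the expected proportion is 1 − .95²⁰, or about 64%, so well over half of these null experiments would report a "finding."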

How  Reliable  and  Valid  Are  Your  Measures? 

Another  major  factor  that  can  affect  a  study’s  findings  is  measurement  er¬ 
ror.  Although  most  statistical  analyses,  and  many  of  the  researchers  who 
conduct  them,  assume  that  assessment  instruments  are  error  free,  this  is 
usually  far  from  the  truth.  In  fact,  assessment  instruments  are  rarely,  if 
ever,  perfect  (see  Chapter  4  for  a  detailed  discussion  of  this  topic).  This 
is  particularly  true  when  using  unstandardized  measures  that  may  vary  in 
their  administration  procedures,  or  when  using  instruments  that  have  little 
if  any  demonstrated  validity  or  reliability  (see  Chapter  6).  For  these  rea¬ 
sons, it is essential that researchers, whenever possible, use psychometrically
sound instruments in their studies. Using error-laden instruments
may  substantially  reduce  the  sensitivity  of  your  analyses  and  obscure  oth¬ 
erwise  significant  findings. 

Statistical  Significance  vs.  Clinical  Significance 

Because  of  the  technical  and  detailed  nature  of  the  research  enterprise,  it 
is  often  easy  to  miss  the  forest  for  the  trees.  Researchers  can  get  so  caught 
up  in  the  rigor  of  data  collection,  management,  and  analysis  that  they  may 
wind up believing that the final value of a research study lies in its p-value.
This is, of course, far from the truth. The real value of a research finding
lies in its clinical significance, not in its statistical significance. In other words,
will the research findings affect how things are done in the real world?

This is not to say that statistical significance is irrelevant. On the contrary,
statistical significance is essential in judging whether a result could
plausibly be due to chance alone. Before we can decide on the clinical
significance of a finding, we must be reasonably confident that the finding is
not simply a chance result. The misperception instead lies in the belief that
statistical significance is meaningful in itself. In fact, study results can be
statistically significant, but clinically meaningless.

To  interpret  the  clinical  significance  of  their  findings,  researchers  might 
examine  a  number  of  other  indices,  such  as  the  effect  size  or  the  percent¬ 
age  of  participants  who  moved  from  outside  a  normal  range  to  within  a 
normal  range.  For  example,  a  study  may  reveal  that  two  different  studying 
methods  lead  to  significantly  different  test  scores,  but  that  neither  method 
results in passing scores. When interpreting research findings, researchers
should consider not only the statistical significance of a finding but also its
clinical, or real-world, importance.
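
The studying-methods example can be put into numbers. The means, standard deviation, sample size, and passing score below are all hypothetical, as are the function names. The sketch uses a z-test on summary statistics and Cohen's d, the difference between means divided by the standard deviation, as one common index of effect size.

```python
import math
from statistics import NormalDist

def cohens_d(mean_a, mean_b, sd):
    """Standardized mean difference (effect size)."""
    return (mean_a - mean_b) / sd

def z_test_p(mean_a, mean_b, sd, n_per_group):
    """Two-tailed p-value for the mean difference, treating sd as known."""
    se = sd * math.sqrt(2 / n_per_group)
    z = (mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results for two studying methods, 1,000 students each.
method_a_mean, method_b_mean = 52.0, 50.0
sd, n_per_group = 10.0, 1000
passing_score = 70.0

d = cohens_d(method_a_mean, method_b_mean, sd)
p = z_test_p(method_a_mean, method_b_mean, sd, n_per_group)

print(f"p = {p:.6f} (statistically significant at .05: {p < 0.05})")
print(f"Cohen's d = {d:.2f} (a small effect by Cohen's conventions)")
print(f"Either method reaches passing: "
      f"{max(method_a_mean, method_b_mean) >= passing_score}")
```

In this made-up case the difference is highly significant because the sample is large, yet the effect is small and neither method brings the average student near a passing score.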

Are  There  Alternative  Explanations? 

As we discussed in Chapter 5, the key element in true experimental research
is scientific control and the ability to rule out alternative explanations.
In Chapter 5, we noted that randomization is the best way to achieve
this type of control. This point cannot be overemphasized. Unless you can
be  relatively  certain  that  there  are  no  systematic  differences  between  the 
experimental  groups  or  conditions,  and  that  the  only  thing  that  varies  is 
the  independent  variable  that  you  are  manipulating,  you  simply  cannot 
rule  out  other  potential  explanations  for  your  findings. 

Even  in  randomized  trials,  there  is  a  chance,  however  small,  that  there 
are  between-group  differences  on  variables  other  than  the  one  that  you  are 
manipulating.  The  wise  researcher  should  always  view  his  or  her  findings 
with  some  degree  of  suspicion  and  always  consider  alternative  explana¬ 
tions  for  those  findings.  It  is  this  critical  analysis  and  inability  to  be  easily 
convinced  that  distinguishes  true  scientific  endeavors  from  lesser  pursuits. 

Are  You  Confusing  Correlation  With  Causation? 

We  know  that  we  already  apologized  for  saying  this  too  often,  but  here  we 
go  again:  Correlation  is  not  causation,  period.  Significant  or  not,  hypothe¬ 
sized  or  not,  large-magnitude  associations  or  not,  simple  measures  of  as¬ 
sociation  should  never  be  interpreted  as  demonstrating  causal  relation¬ 
ships.  Where  would  we  be  if  we  accepted  such  faulty  logic?  We  would 
probably  be  in  a  society  that  believes  cold  temperatures  cause  colds,  or 
that  rock  music  leads  to  drug  abuse.  Okay,  so  maybe  we  are  not  always  so 
literal.  However,  the  thing  that  sets  scientists  apart  from  laypeople  (other 
than  our  low  incomes)  is  our  knowledge  of  the  scientific  method  and  our 
ability  to  discriminate  between  assumption  and  fact  (see  Chapter  1  for  a 
discussion  of  the  scientific  method). 

The  bottom  line  about  causality  is  that  it  cannot  be  inferred  without 
random  assignment.  In  other  words,  the  researcher  must  be  the  one  who 
selects  and  manipulates  the  independent  variables,  and  this  must  be  done 
prospectively.  If  this  is  not  the  case,  you  may  find  a  significant  association 
between variables, but you simply cannot infer causation. Importantly, this
is  true  regardless  of  the  statistical  tests  that  are  used.  It  does  not  matter 
whether  you  used  a  linear  regression,  an  ANOVA,  or  an  even  more  so¬ 
phisticated  statistical  technique.  Unless  randomization  and  control  are 
employed, causation cannot be inferred.

How Significant Is Your Nonsignificance?


The  last  point  that  we  want  to  cover  with  regard  to  the  interpretation  of 
study  results  is  the  issue  of  nonsignificance.  As  a  general  guideline,  re¬ 
searchers  should  not  be  overly  invested  in  finding  a  specific  outcome.  That 
is,  even  though  they  may  have  strong  rationales  for  hypothesizing  partic¬ 
ular  results,  they  should  not  place  all  their  hopes  on  having  their  studies 
turn  out  as  they  may  have  expected.  Not  only  could  such  an  approach  pre¬ 
cipitate  bias,  but  it  could  lead  to  a  common  misperception  among  research 
scientists: namely, that nonsignificant results are not useful. On the contrary,
nonsignificant findings can be as important as, if not more important
than, significant ones.

The  furtherance  of  science  depends  on  the  empirical  evaluation  of 
widely  held  assumptions  and  what  many  consider  to  be  common  sense. 
The  furtherance  of  science  also  depends  on  attempts  to  replicate  research 
findings  and  to  determine  whether  findings  found  in  one  population  gen¬ 
eralize  to  other  populations.  In  any  of  these  cases,  nonsignificant  findings 
can  have  some  very  significant  (important)  implications.  Therefore,  it  is 
strongly  recommended  that  researchers  be  as  neutral  and  objective  as  pos¬ 
sible  when  analyzing  and  interpreting  their  results.  In  many  cases,  less  may, 
in  fact,  be  more. 


Publication Bias

A number of studies (e.g., Ioannidis, 1998; Sterns & Simes, 1997)
have found a connection between the significance of a study’s findings
and its publishability. Specifically, these researchers have found
that a greater percentage of studies that report significant findings
wind up being published and that studies reporting nonsignificant
findings face greater publication delays.

SUMMARY

In this chapter, we have reviewed some of the major objectives and
techniques involved in the preparation, analysis, and interpretation
of study data. In the first section, we discussed the importance of
properly logging and screening data, designing a well-structured
database and codebook, and transforming variables into an efficient
and analyzable form. In the second section, we covered the two primary
categories of statistical analyses (descriptive and inferential) and
provided a brief overview of several of the most widely used analytic
techniques. In the last section, we presented a wide range of issues that
researchers should consider when interpreting their research findings.
Specifically, we sought to express the potential influence that issues such
as power, statistical assumptions, multiple comparisons, measurement error,
clinical significance, alternative explanations, and inferences about
causality can have on the way that you interpret your data.


TEST YOURSELF

1. A written or computerized record that provides a clear and comprehensive
   description of all variables entered into a database is known as a
   __________.

2. __________ statistics are generally used to accurately characterize the
   data collected from a study sample.

3. A graph that illustrates the frequency of observations by groups is known
   as a __________.

4. A measure of the spread of values around the mean of a distribution is
   known as the __________.

5. Analysis of variance (ANOVA) is used to measure differences in group
   __________.

Answers: 1. data codebook; 2. Descriptive; 3. histogram; 4. standard deviation; 5. means

ETHICAL CONSIDERATIONS IN RESEARCH


In  the  previous  chapters,  we  reviewed  many  of  the  methodological 
issues  that  should  be  considered  when  conducting  research.  We  dis¬ 
cussed  how  researchers  should  begin  their  research  endeavors  by  gen¬ 
erating  relevant  questions,  formulating  clear  and  testable  hypotheses,  and 
selecting  appropriate  and  practical  research  designs.  By  adhering  to  the 
scientific  method,  researchers  can,  in  due  course,  obtain  valid  and  reliable 
findings  that  may  advance  scientific  knowledge. 

Unavoidably,  however,  to  advance  knowledge  in  this  manner  it  is  often 
necessary  to  impinge  upon  the  rights  of  individuals.  Virtually  all  studies 
with  human  participants  involve  some  degree  of  risk.  These  risks  may 
range  from  minor  discomfort  or  embarrassment  caused  by  somewhat  in¬ 
trusive  or  provocative  questions  (e.g.,  questions  about  sexual  practices, 
drug  and  alcohol  use)  to  much  more  severe  effects  on  participants’  physi¬ 
cal  or  emotional  well-being.  These  risks  present  researchers  with  an  ethi¬ 
cal  dilemma  regarding  the  degree  to  which  participants  should  be  placed 
at  risk  in  the  name  of  scientific  progress. 

A  number  of  ethical  codes  have  been  developed  to  provide  guidance 
and  establish  principles  to  address  such  ethical  dilemmas.  These  codes  in¬ 
clude  federally  mandated  regulations  promulgated  by  the  U.S.  Depart¬ 
ment  of  Health  and  Human  Services  (Title  45,  Part  46  of  the  Code  of  Fed¬ 
eral  Regulations),  as  well  as  those  developed  for  specific  fields  of  study,  such 
as  the  APA’s  Ethical  Principles  of  Psychologists  and  Code  of  Conduct  (2002). 
These  codified  principles  are  intended  to  ensure  that  researchers  consider 
all potential risks and ethical conflicts when designing and conducting research.
Moreover, these principles are intended to protect research participants
from harm (Sieber & Stanley, 1988).

To help the reader better contextualize and appreciate the importance
of the protection of research participants, this chapter will begin by reviewing
the historical evolution of research ethics. We will then discuss the
fundamental ethical principles of respect for persons, beneficence, and
justice, which serve as the foundation for the formal protection of research
participants. Finally, we will review two of the most essential processes
in the protection of research participants: informed consent and
the institutional review board. The purpose of this chapter is to familiarize
the reader with some of the most common ethical issues in research
with human participants, and it should not be considered a comprehensive
review of all ethical principles and regulatory and legal guidelines and
requirements. Before researchers undertake any study involving human
participants, they should consult the specific rules of their institutions, the
requirements of their institutional review boards, and applicable federal
regulations, including Title 45, Part 46 of the Code of Federal Regulations.

HISTORICAL  BACKGROUND 

Many of the most significant medical and behavioral advancements of the
20th century, including vaccines for diseases such as smallpox and polio,
required years of research and testing, much of which was done with
human participants. Regrettably, however, many of these well-known advancements
have somewhat sinister histories, as they were made at the expense
of vulnerable populations such as inpatient psychiatric patients and
prisoners, as well as noninstitutionalized minorities. In fact, a large proportion
of these study participants were involved in clinical research without
ever being informed. Revelations about Nazi medical experiments and
unethical studies conducted within the United States (e.g., the Tuskegee
Syphilis Study [see Rapid Reference 8.1]; Milgram’s Obedience and Individual
Responsibility Study [Milgram, 1974]; human radiation experiments)
heightened public awareness about the potential for, and the often
tragic consequences of, research misconduct.

Rapid Reference 8.1

The Tuskegee Syphilis Study

In 1932, the U.S. Public Health Service began a 40-year longitudinal study
to examine the natural course of untreated syphilis. Four hundred Black
men living in Tuskegee, Alabama, who had syphilis were compared to 200
uninfected men. Participants were recruited with the promise that they
would receive “special treatment” for their “bad blood.” Horrifyingly, government
officials went to extreme lengths to ensure that the participants
in fact received no therapy from any source. The “special treatment” that
was promised was actually very painful spinal taps, performed without
anesthesia, not as a treatment but merely to evaluate the neurological
effects of syphilis. Moreover, even though penicillin was identified as an effective
treatment for syphilis as early as the 1940s, the 400 infected men
were never informed about or treated with the medication. By 1972,
when public revelations and outcry forced the government to end the
study, only 74 of the original 400 infected participants were still alive. Further
examination revealed that somewhere between 28 and 100 of these
participants had died as a direct result of their infections.


Over the past half-century, the international and U.S. medical communities
have taken a number of steps to protect individuals who participate
in research studies. Developed in response to the Nuremberg Trials of
Nazi doctors who performed unethical experimentation during World
War II, the Nuremberg Code (see Rapid Reference 8.2) was the first major
international document to provide guidelines on research ethics. It
made voluntary consent a requirement in clinical research studies, emphasizing
that consent can be voluntary only under the following conditions:

1.  Participants  are  able  to  consent. 

2.  They  are  free  from  coercion  (i.e.,  outside  pressure). 

3.  They  comprehend  the  risks  and  benefits  involved. 

The Nuremberg Code also clearly requires that researchers should minimize
risk and harm, ensure that risks do not significantly outweigh potential
benefits, use appropriate study designs, and guarantee participants’
freedom to withdraw at any time. The Nuremberg Code was adopted by
the United Nations General Assembly in 1948.

Rapid Reference 8.2

The Nuremberg Code

1. The voluntary consent of the human subject is absolutely essential.

2. The experiment should be such as to yield fruitful results for the
good of society, unprocurable by other methods or means of study,
and not random and unnecessary in nature.

3. The experiment should be so designed and based on the results of
animal experimentation and a knowledge of the natural history of the
disease or other problem under study, that the anticipated results will
justify the performance of the experiment.

4. The experiment should be so conducted as to avoid all unnecessary
physical and mental suffering and injury.

5. No experiment should be conducted, where there is an a priori reason
to believe that death or disabling injury will occur; except, perhaps,
in those experiments where the experimental physicians also
serve as subjects.

6. The degree of risk to be taken should never exceed that determined
by the humanitarian importance of the problem to be solved by the
experiment.

7. Proper preparations should be made and adequate facilities provided
to protect the experimental subject against even remote possibilities
of injury, disability, or death.

8. The experiment should be conducted only by scientifically qualified
persons. The highest degree of skill and care should be required
through all stages of the experiment of those who conduct or engage
in the experiment.

9. During the course of the experiment, the human subject should be at
liberty to bring the experiment to an end, if he has reached the physical
or mental state, where continuation of the experiment seemed to
him to be impossible.

10. During the course of the experiment, the scientist in charge must be
prepared to terminate the experiment at any stage, if he has probable
cause to believe, in the exercise of the good faith, superior skill and
careful judgment required of him, that a continuation of the experiment
is likely to result in injury, disability, or death to the experimental
subject.

Source: Trials of War Criminals Before the Nuremberg Military Tribunals Under Control Council
Law No. 10. (1949). Vol. 2, pp. 181-182. Washington, D.C.: U.S. Government Printing Office.

The next major development in the protection of research participants
came in 1964 at the 18th World Medical Assembly in Helsinki, Finland.
With the establishment of the Helsinki Declaration, the World Medical
Association adopted 12 principles to guide physicians on ethical considerations
related to biomedical research. Among its many contributions,
the declaration helped to clarify the very important distinction between
medical treatment, which is provided to directly benefit the patient, and medical
research, which may or may not provide a direct benefit. The declaration
also recommended that human biomedical research adhere to accepted
scientific principles and be based on scientifically valid and rigorous laboratory
and animal experimentation, as well as on a thorough knowledge of
scientific literature. These guidelines were revised at subsequent meetings
in 1975, 1983, and 1989.

In 1974, largely in response to the Tuskegee Syphilis Study, the U.S.
Congress passed the National Research Act, creating the National Commission
for the Protection of Human Subjects of Biomedical and Behavioral
Research. The National Research Act led to the development of institutional
review boards (IRBs). These review boards, which we will describe
in detail later, are specific human-subjects committees that review and determine
the ethicality of research. The National Research Act required
IRB review and approval of all federally funded research involving human
participants. The Commission was responsible for (1) identifying the ethical
principles that should govern research involving human participants
and (2) recommending steps to improve the Regulations for the Protection
of Human Subjects.

In 1979, the National Commission for the Protection of Human Subjects
of Biomedical and Behavioral Research issued “The Belmont Report:
Ethical Principles and Guidelines for the Protection of Human Subjects
of Research.” The Belmont Report established three principles that underlie
the ethical conduct of all research conducted with human participants:
(1) respect for persons, (2) beneficence, and (3) justice (see Rapid
Reference 8.3).

Rapid Reference 8.3

The Belmont Report: Summary of Basic Principles

1. Respect for Persons

Respect for persons incorporates at least two ethical convictions: first,
that individuals should be treated as autonomous agents, and second, that
persons with diminished autonomy are entitled to protection. The principle
of respect for persons thus divides into two separate moral requirements:
the requirement to acknowledge autonomy, and the requirement
to protect those with diminished autonomy.

2. Beneficence

Persons are treated in an ethical manner, not only by respecting their decisions
and protecting them from harm, but also by making efforts to secure
their well-being. Such treatment falls under the principle of beneficence.
The term “beneficence” is often understood to cover acts of
kindness or charity that go beyond strict obligation. In this document,
beneficence is understood in a stronger sense, as an obligation. Two general
rules have been formulated as complementary expressions of beneficent
actions in this sense: (1) do not harm, and (2) maximize possible
benefits, and minimize possible harms.

3. Justice

Who ought to receive the benefits of research and bear its burdens? This
is a question of justice, in the sense of “fairness in distribution” or “what is
deserved.” An injustice occurs when some benefit to which a person is
entitled is denied without good reason, or when some burden is imposed
unduly. Another way of conceiving the principle of justice is that equals
ought to be treated equally. However, this statement requires explication.
Who is equal and who is unequal? What considerations justify departure
from equal distribution? Almost all commentators allow that distinctions
based on experience, age, deprivation, competence, merit, and position
do sometimes constitute criteria justifying differential treatment for certain
purposes. It is necessary, then, to explain in what respects people
should be treated equally. There are several widely accepted formulations
of just ways to distribute burdens and benefits. Each formulation mentions
some relevant property, on the basis of which burdens and benefits
should be distributed. These formulations are (1) to each person an equal
share, (2) to each person according to individual need, (3) to each person
according to individual effort, (4) to each person according to societal
contribution, and (5) to each person according to merit.

The Belmont Report explains how these principles apply to research
practices. For example, it identifies informed consent as a process that is
essential to the principle of respect. In response to the Belmont Report,
both the U.S. Department of Health and Human Services and the U.S.
Food and Drug Administration revised their regulations on research studies
that involve human participants.

In 1994, largely in response to information about 1940s experiments
involving the injection of research participants with plutonium as well as
other radiation experiments conducted on indigent patients and children
with mental retardation (see Rapid Reference 8.4), President Clinton created
the National Bioethics Advisory Commission (NBAC). Since its inception,
NBAC has generated a total of 10 reports. These reports have
served to provide advice and make recommendations to the National Science
and Technology Council and to other government entities, and to
identify broad principles to govern the ethical conduct of research.

Rapid Reference 8.4

Human Radiation Experiments

President William J. Clinton formed the Advisory Committee on Human
Radiation Experiments in 1994 to uncover the history of human radiation
experiments. According to the committee’s final report, several
agencies of the United States government, including the Atomic Energy
Commission, and several branches of the military services, conducted or
sponsored thousands of human radiation experiments and several hundred
intentional releases of radiation between the years of 1946 and
1974. Among the committee’s harshest criticisms was that physicians
used patients without their consent in experiments in which the patients
could not possibly benefit medically. The principal purpose of these experiments
was ostensibly to help atomic scientists understand the potential
dangers of nuclear war and radiation fallout. These experiments were
conducted in “secret” with the belief that this was necessary to protect
national security. The committee concluded that the government was
responsible for failing to implement many of its own protection policies.
The committee further concluded that individual researchers failed to
comply with the accepted standards of professional ethics. In October
1995, after receiving the committee’s final report, President Clinton offered
a public apology to the experimental subjects, and in March 1997,
he agreed to provide financial compensation to all of the individuals who
were injured.

FUNDAMENTAL  ETHICAL  PRINCIPLES 

The many post-Nuremberg efforts just reviewed have largely defined the
philosophical and administrative basis for most existing codes of research
ethics. Although these codes may differ slightly across jurisdictions and
disciplines, they all emphasize the protection of human participants and,
as outlined in the Belmont Report, have been established to ensure autonomy,
beneficence, and justice.

Respect  for  Persons 

As described in the Belmont Report, “Respect for persons incorporates at
least two ethical mandates: first, that individuals be treated as autonomous
agents, and second, that individuals with diminished autonomy are entitled
to protection” (1979, p. 4). The concept of autonomy, which is clearly integral
to this principle, means that human beings have the right to decide what
they want to do and to make their own decisions about the kinds of research
experiences they want to be involved in, if any. In cases in which one’s autonomy
is diminished due to cognitive impairment, illness, or age, the researcher
has an obligation to protect the individual’s rights. Respect for persons
therefore serves as the underlying basis for what might be considered
the most fundamental ethical safeguard underlying research with human
participants: the requirement that researchers obtain informed consent
from individuals who freely volunteer to participate in their research.

Coercion, or forcing someone to participate in research, is antithetical
to the idea of respect for persons and is clearly unethical. Although there
are many safeguards in place to ensure that explicit coercion into research,
such as the research practiced in Nazi concentration camps, is no longer
likely, there are still many situations in which more subtle or implicit coercion
may take place. For example, consider a population of prison inmates
or individuals who have just been arrested. If they are asked to participate
in a study, is it coercive? It may be, if the prison administrators, judge, or
other criminal justice staff are the ones who ask them to participate, or if the
distinction between researchers and criminal justice staff is unclear. In such
instances, the participants may feel unduly pressured or coerced to participate
in the study, fearing negative repercussions if they choose to decline.
This type of implicit coercion might also occur in any situation in which
the participant is in a vulnerable position or in which the study recruiter or
perceived recruiter is in a position of power or authority (e.g.,
teacher-student, employer-employee).

Importantly, the principle of respect for persons does not mean that
potentially vulnerable or coercible populations should be prevented from
participating in research. On the contrary, respect for persons means that
these individuals should have every right to participate in research if they
so choose. The main point is that these individuals should be able to make
this decision autonomously. For these reasons, it is probably good practice
for researchers to maintain clear boundaries between themselves and persons
who have authority over prospective research participants.

Beneficence 

Beneficence means being kind, or a charitable act or gift. In the research context,
the ethical principle of beneficence has its origins in the famous edict
of the Hippocratic Oath, which has been taken by physicians since ancient
times: “First, do no harm.” Above all, researchers should not harm their
participants and, ultimately, the benefits to their participants should be
maximized and potential harms and discomforts should be minimized. In
conducting research, the progress of science should not come at the price
of harm to research participants. For example, even if the Tuskegee experiments
had resulted in important information on the course of syphilis
(which remains unclear), the government did not have the right to place
individuals at risk of harm and death to obtain this information.

Importantly, the edict “do no harm” is probably more easily adhered to
in clinical practice in which clinicians employ well-established and
well-validated procedures. The potential risks and benefits are typically less
predictable in the context of research in which new procedures are being
tested. This poses an important ethical dilemma for researchers. On the
one hand, the researcher may have a firm basis for believing and hypothesizing
that a specific treatment will be helpful and beneficial. On the other
hand, because it has not yet been tested, he or she can only speculate about
the potential harm and side effects that may be associated with the treatment
or intervention.

To determine whether a research protocol has an acceptable risk/benefit
ratio, the protocol describing all aspects of the research and potential
alternatives must be reviewed. According to the Belmont Report, there
should also be close communication between the IRB and the researcher.
The IRB should (1) determine the validity of the assumptions on which
the research is based, (2) distinguish the nature of the risk, and (3) determine
whether the researcher’s estimates of the probability of harm or benefits
are reasonable.

The Belmont Report delineates five rules that should be followed in determining
the risk/benefit ratio of a specific research endeavor (National
Commission for the Protection of Human Subjects of Biomedical and
Behavioral Research, 1979, p. 8):

1. Brutal or inhumane treatment of human subjects is never
morally justified.

2. Risks should be reduced to those necessary to achieve the research
objective. It should be determined whether it is in fact
necessary to use human subjects at all. Risk can perhaps never
be entirely eliminated, but it can often be reduced by careful
attention to alternative procedures.

3. When research involves significant risk of serious impairment,
review committees should be extraordinarily insistent on the
justification of the risk (looking usually to the likelihood of benefit
to the subject or, in some rare cases, to the manifest voluntariness
of the participation).

4. When vulnerable populations are involved in research, the appropriateness
of involving them should itself be demonstrated.
A number of variables go into such judgments, including the nature
and degree of risk, the condition of the particular population
involved, and the nature and level of the anticipated benefits.

5. Relevant risks and benefits must be thoroughly arrayed in documents
and procedures used in the informed consent process.


Justice 

The principle of justice relates most directly to the researcher’s selection
of research participants. According to the Belmont Report, the selection
of research participants must be the result of fair selection procedures and
must also result in fair selection outcomes. The justness of participant selection
relates both to the participant as an individual and to the participant
as a member of social, racial, sexual, or ethnic groups. Importantly,
there should be no bias or discrimination in the selection and recruitment
of research participants. In other words, they should not be selected because
they are viewed positively or negatively by the researcher (e.g., involving
so-called undesirable persons in risky research).

In addition to the selection of research participants, the principle of justice
is also relevant to how research participants are treated, or not treated.
As we discussed in Chapter 5, the use of control conditions is essential to
randomized, controlled studies, which are the only true method for confidently
evaluating the effectiveness of a specific treatment or intervention.
The dilemma here is whether it is ethical or just to assign some participants
to receive a potentially helpful intervention, and others to not receive it.
Although this may be less of an issue in certain types of research, it is a critical
issue in medical studies involving treatment for debilitating conditions,
or in criminal justice or social policy research involving potentially
life-changing opportunities. One might ask why the researcher could not
simply ask for volunteers for the control condition. The answer to this
question is that participants’ awareness of being in a control condition
may alter the results. It is therefore necessary to blind the participants (i.e.,
to keep participants unaware of their experimental assignments), which
raises yet another potential ethical dilemma.

Fortunately, there are several ways to address these ethical concerns.
First, the research participants must be clearly informed that they will be
randomly assigned to either an experimental condition or a control condition,
and they should also be informed of the likelihood (e.g., one in two,
one in three) of being assigned to one condition or the other. Second, the
researcher should assure participants that they will receive full disclosure
regarding their assignment following the completion of the study, and the
researcher should provide the opportunity to those who had been assigned
to the control condition to receive the experimental treatment if it
is shown to be effective.
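
The two safeguards just described, random assignment with odds that can be stated to participants in advance and a record of who is owed an offer of the treatment afterward, can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions and is not a procedure taken from this chapter: the function names assign_conditions and control_participants, the convention that condition 0 is the control condition, and the equal allocation across conditions are all hypothetical choices made for the example.

```python
import random
from fractions import Fraction


def assign_conditions(participant_ids, n_conditions=2, seed=None):
    """Randomly assign participants to equally sized conditions.

    Hypothetical helper: condition 0 is assumed to be the control condition.
    Returns the assignment map and the odds of landing in any one
    condition, which is the figure that would be disclosed during consent.
    """
    ids = list(participant_ids)
    if len(ids) % n_conditions != 0:
        # With unequal groups the stated odds would only be approximate.
        raise ValueError("number of participants must divide evenly")
    rng = random.Random(seed)
    rng.shuffle(ids)  # random order, so group membership is left to chance
    # Deal the shuffled participants out to the conditions in turn.
    assignments = {pid: position % n_conditions
                   for position, pid in enumerate(ids)}
    disclosed_odds = Fraction(1, n_conditions)  # e.g., 1/2 or 1/3
    return assignments, disclosed_odds


def control_participants(assignments):
    """List participants to be offered the treatment if it proves effective."""
    return sorted(pid for pid, condition in assignments.items()
                  if condition == 0)


assignments, odds = assign_conditions(range(1, 7), n_conditions=3, seed=8)
print(f"Odds of any one condition: {odds.numerator} in {odds.denominator}")
print("Offer treatment after the study to:", control_participants(assignments))
```

Keeping the assignment map until the study ends is what makes full disclosure possible afterward. In an actual blinded study that record would presumably be held by someone other than the staff who interact with participants.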

DON’T FORGET

Confidentiality

The right to confidentiality is embodied in the principles of respect for
persons, beneficence, and justice. Generally, confidentiality involves both
an individual’s right to have control over the use or access of his or her
personal information as well as the right to have the information that he
or she shares with the research team kept private. The researcher is
responsible not only for maintaining the confidentiality of all information
protected by law, but also for information that might affect the privacy
and dignity of research participants. During the consent process, the researcher
must clearly explain all issues related to confidentiality, including
who will have access to their information, the limits of confidentiality, risks
related to potential breaches of confidentiality, and safeguards designed to
protect their confidentiality (e.g., plans for data transfer, data storage, and
recoding and purging data of client identifiers). Researchers should be
aware of the serious effects that breaches in confidentiality could have on
the research participants, and employ every safeguard to prevent such
violations, including careful planning and training of research staff. Researchers
should also familiarize themselves with all applicable institutional,
local, state, and federal regulations governing their research.

Rapid Reference 8.5

Federal Research Protections

There are two primary categories of federal research protections for human
participants. The first is provided in the Federal Policy for the Protection
of Human Subjects, also known as the Common Rule. The Common
Rule is a set of regulations adopted independently by 17 federal agencies
that support or conduct research with human research participants. The
17 agencies adopted regulations based on the language set forth in Title
45, Part 46, Subpart A, of the Code of Federal Regulations (CFR). Thus, the
Common Rule is, for most intents and purposes, Subpart A of the Department
of Health and Human Services’ regulations. The second category
of federal protections that relates to human research participants is
the set of rules governing drug, device, and biologics research. These rules
are administered by the U.S. Food and Drug Administration (FDA).
Specifically, the FDA regulates research involving products regulated by
the FDA, including research and marketing permits for drugs, biological
products, and medical devices for human use, regardless of whether federal
funds are used.


To ensure that the basic tenets of the Belmont Report were adhered to,
the federal government, through the Department of Health and Human
Services, codified a set of research-related regulations. Known as 45 CFR
46, indicating the specific Title 45 and Part 46 of the Code of Federal Regulations,
the document details the regulations that must be observed when
conducting research with human participants (see Rapid Reference 8.5).
In general, the federal regulations focus on two main areas that are integral
to the protection of human participants: informed consent and institutional
review boards.

INFORMED  CONSENT 

The principal mechanism for describing the research study to potential
participants and providing them with the opportunity to make autonomous
and informed decisions regarding whether to participate is informed
consent. For this reason, informed consent has been characterized as
the cornerstone of human rights protections. The three basic elements of
informed consent are that it must be (1) competent, (2) knowing, and (3)
voluntary. Notably, each of these three prongs may be conceptualized as
having its own unique source of vulnerability. In the context of research,
these potential vulnerabilities may be conceptualized as stemming from
sources that may be intrinsic, extrinsic, or relational (Roberts & Roberts,
1999):

1 .  Intrinsic  vulnerabilities  are  personal  characteristics  that  may  limit  an 
individual’s  capacities  or  freedoms.  For  instance,  an  individual 
who  is  under  the  influence  of  a  psychoactive  substance  or  is  ac¬ 
tively  psychotic  might  have  difficulty  comprehending  or  attend¬ 
ing  to  consent  information.  Such  vulnerabilities  relate  to  the 
first  prong  of  informed  consent,  that  of  competence  (also  re¬ 
ferred  to  in  the  literature  as  “decisional  capacity”).  Many  theo¬ 
rists  have  broadly  conceptualized  competence  to  include  such 
functions  as  understanding,  appreciation,  reasoning,  and  ex¬ 
pressing  a  choice  (Appelbaum  &  Grisso,  2001).  However,  these 
functions  are  directly  related  to  the  legal  and  ethical  concept  of 
competence  only  insofar  as  they  refer  to  an  individual’s  intrinsic 
capability  to  engage  in  these  functions. 

2. Extrinsic vulnerabilities are situational factors that may limit the
capacities or freedoms of the individual. For example, an individual
who has just been arrested or who is facing sentencing may
be too anxious or confused, or too exposed to implicit or explicit
coercion, to provide voluntary and informed consent. Such
extrinsic  vulnerabilities  may  relate  either  to  knowingness  or  to 
voluntariness  to  the  degree  that  the  situation,  not  the  individ¬ 
ual’s  capacity,  prevents  him  or  her  from  making  an  informed 
and  autonomous  decision. 

3.  Relational  vulnerabilities  occur  as  a  result  of  a  relationship  with 
another  individual  or  set  of  individuals.  For  example,  a  prisoner 
who is asked by the warden to participate in research is unlikely
to feel free to decline. Similarly, a terminally ill person recruited
into  a  study  by  a  caregiver  may  confuse  the  caregiving  and  re¬ 
search  roles.  Relational  vulnerabilities  typically  relate  to  the  third 
prong  of  the  informed  consent  process,  voluntariness.  Certain 
relationships  may  be  implicitly  coercive  or  manipulative  because 
they  may  unduly  influence  the  individual’s  decision. 

Competence 

The  presence  of  cognitive  impairment  or  limited  understanding  does  not 
automatically  disqualify  individuals  from  consenting  or  assenting  to  re¬ 
search  studies.  As  discussed,  the  principle  of  respect  for  persons  asserts 
that  these  individuals  should  have  every  right  to  participate  in  research  if 
they so choose. According to federal regulations (45 CFR § 46.111[b]),
“When  some  or  all  of  the  subjects  are  likely  to  be  vulnerable  to  coercion 
or  undue  influence,  such  as  children,  prisoners,  pregnant  women,  mentally 
disabled  persons,  or  economically  or  educationally  disadvantaged  persons, 
additional  safeguards  have  been  included  in  the  study  to  protect  the  rights 
and  welfare  of  these  subjects.”  Therefore,  the  critical  issue  is  not  whether 
they  should  be  allowed  to  participate,  but  whether  their  condition  leads  to 
an  impaired  decisional  capacity. 

To  our  knowledge,  there  has  been  only  one  instrument  developed 
specifically  for  this  purpose,  the  MacArthur  Competence  Assessment 
Tool  for  Clinical  Research  (Appelbaum  &  Grisso,  2001).  Developed  by 
two  of  the  leading  authorities  in  consent  and  research  ethics,  the  instru¬ 
ment  provides  a  semistructured  interview  format  that  can  be  tailored  to 
specific  research  protocols  and  used  to  assess  and  rate  the  abilities  of  po¬ 
tential  research  participants  in  four  areas  that  represent  part  of  the  stan¬ 
dard  of  competence  to  consent  in  many  jurisdictions.  The  instrument 
helps  to  determine  the  degree  to  which  potential  participants  (1)  under¬ 
stand  the  nature  of  the  research  and  its  procedures;  (2)  appreciate  the  con¬ 
sequences  of  participation;  (3)  show  the  ability  to  consider  alternatives,  in¬ 
cluding  the  option  not  to  participate;  and  (4)  show  the  ability  to  make  a 
reasoned choice. Although this instrument appears to be appropriate for
assessing competence, researchers should make certain to carefully
consult local and institutional regulations before relying solely on this type of
instrument.  Depending  on  the  specific  condition  of  the  potential  partici¬ 
pants,  researchers  may  want  to  engage  the  services  of  a  specialist  (e.g.,  a 
neurologist,  child  psychologist)  when  making  competence  determina¬ 
tions. 

Importantly,  researchers  should  not  mistakenly  interpret  potential  par¬ 
ticipants’  attentiveness  and  agreeable  comments  or  behavior  as  evidence 
of  their  competence  because  many  cognitively  impaired  persons  retain  at¬ 
tentiveness  and  social  skills.  Similarly,  performance  on  brief  mental  status 
exams  should  not  be  considered  sufficient  to  determine  competence,  al¬ 
though  such  information  may  be  helpful  in  combination  with  other  com¬ 
petence  measures. 

If  the  potential  research  participant  is  determined  to  be  competent  to 
provide  consent,  the  researcher  should  obtain  the  participant’s  informed 
consent.  If  the  potential  participant  is  not  sufficiently  competent,  in¬ 
formed  consent  should  be  obtained  from  his  or  her  caregiver  or  surrogate 
and  assent  should  be  obtained  from  the  participant. 

Knowingness 

It  is  still  not  clear  whether  many  research  participants  actually  participate 
knowledgeably  in  decision  making  about  their  research  involvement.  In 
fact,  evidence  suggests  that  participants  in  clinical  research  often  fail  to 
understand  or  remember  much  of  the  information  provided  in  consent 
documents,  including  information  relevant  to  their  autonomy,  such  as  the 
voluntary  nature  of  participation  and  their  right  to  withdraw  from  the 
study  at  any  time  without  negative  repercussions. 

Problems with the understanding of both research and treatment protocols
have been widely reported (e.g., Dunn & Jeste, 2001). Studies indicate
that research participants often lack awareness of being participants
in a research study, have poor recall of study information, have inadequate
recall of important risks of the procedures or treatments, lack
understanding of randomization procedures and placebo treatments, lack
awareness of the ability to withdraw from the research study at any time,
and are often confused about the dual roles of clinician versus researcher
(Appelbaum, Roth, & Lidz, 1982; Cassileth, Zupkis, Sutton-Smith, &
March, 1980; Sugarman, McCrory, & Hubal, 1998).

The Therapeutic Misconception

The therapeutic misconception occurs when research participants confuse
the general intentions of research with those of treatment, or the role of
researchers with the role of clinicians. This misconception refers specifically
to the mistaken belief that the principle of personal care applies even in
research settings. This may also be seen as a sort of “white-coat
phenomenon,” in which, as a result of their learning history, individuals may
hold on to the mistaken belief that any doctor or professional has only their
best interests in mind. This may compromise their ability to accurately weigh
the potential risks and benefits of participating in a particular study.

A  number  of  client  variables  are  associated  with  the  understanding  of 
consent  information.  Several  studies  (e.g.,  Aaronson  et  al.,  1996;  Agre, 
Kurtz, & Krauss, 1994; Bjorn & Holm, 1999) found educational and
vocabulary levels to be significantly and positively correlated with measures
of understanding of consent information. Although age alone has not been
consistently associated with diminished performance on consent quizzes,
it  does  appear  to  interact  with  education  in  that  older  individuals  with  less 
education  display  decreased  understanding  of  consent  information  (Taub, 
Baker,  Kline,  &  Sturr,  1987). 

Drug  and  alcohol  abusers  may  present  a  unique  set  of  difficulties  in 
terms  of  their  comprehension  and  retention  of  consent  information,  not 
only  because  of  the  mental  and  physical  reactions  to  the  psychoactive  sub¬ 
stances,  but  also  because  of  the  variety  of  conditions  that  are  comorbid 
with  substance  abuse  (McCrady  &  Bux,  1999).  Acute  drug  intoxication  or 
withdrawal can impair attention, cognition, or retention of important
information (e.g., Tapert & Brown, 2000). Limited educational opportunities,
chronic brain changes resulting from long-term drug or alcohol use,
prior head trauma, poor nutrition, and comorbid health problems (e.g.,
AIDS-related dementia) are common in individuals with substance abuse
or dependence diagnoses and may also reduce concentration and limit
understanding during the informed consent process (McCrady & Bux, 1999).

Although  the  number  of  articles  published  on  informed  consent  has  in¬ 
creased  steadily  over  the  past  30  years  (Kaufmann,  1983;  Sugarman  et  al., 
1999),  the  number  of  studies  that  have  actually  tested  methods  for  im¬ 
proving  the  informed  consent  process  is  quite  limited.  In  their  2001  ar¬ 
ticle,  Dunn  and  Jeste  reviewed  a  total  of  34  experimental  studies  that  had 
examined  the  effects  of  interventions  designed  to  increase  understanding 
of  informed  consent  information.  Of  the  34  studies  reviewed,  25  found 
that  participants’  understanding  or  recall  showed  improvement  using  a 
limited  array  of  interventions.  The  strategies  that  have  proven  most  suc¬ 
cessful  fall  into  two  broad  categories:  (1)  those  focusing  on  the  structure  of 
the  consent  document,  and  (2)  those  focusing  on  the  process  of  presenting 
consent  information.  Successful  strategies  directed  toward  the  structure 
of  the  consent  form  involved  the  use  of  forms  that  were  more  highly  struc¬ 
tured,  better  organized,  shorter,  and  more  readable,  and  that  used  simpli¬ 
fied  and  illustrated  formats.  Successful  strategies  involving  the  consent 
process  included  corrected  feedback  and  multiple  learning  trials,  and  the 
use  of  summaries  of  consent  information.  Other  efforts  that  were  gener¬ 
ally  not  successful  or  that  showed  mixed  results  included  the  use  of  video¬ 
tape  methodologies  and  the  use  of  highly  detailed  consent  information, 
which  were  not  associated  with  improved  understanding  in  either  a  re¬ 
search  or  clinical  context. 

Other  strategies  have  been  shown  to  help  individuals  remember  con¬ 
sent  information  beyond  the  initial  testing  period.  This  has  specific  im¬ 
portance  in  that  it  speaks  to  the  ability  of  research  participants  to  retain 
information  related  to  (1)  their  right  to  withdraw  from  the  research  study 
at any time with no negative consequences, (2) procedures for contacting
designated individuals if an adverse event occurs, and (3) procedures
for obtaining compensation for harm or injury incurred as a result
of study participation. Successful strategies for improving recall of
consent information have included making postconsent telephone contacts,
using simplified and illustrated presentations, and providing corrected
feedback and multiple learning trials. Still, there is much room for
improvement, and research should continue to explore methods of improving
participants’ comprehension and retention of consent information.

Voluntariness 

The  issue  of  whether  consent  is  voluntary  is  of  particular  importance 
when  conducting  research  with  disenfranchised  and  vulnerable  popula¬ 
tions,  such  as  individuals  involved  with  the  criminal  justice  system.  These 
populations are regularly exposed to implicit and explicit threats of
coercion, deceit, and other kinds of overreaching that may jeopardize the
element of voluntariness. In particular, there is a substantial risk that, as a
result of their current situation, they may become convinced, rightly or
wrongly, that their future depends on cooperating with authorities. This
source  of  vulnerability  is  very  different  from  knowingness  or  competence, 
because  even  the  most  informed  and  capable  individual  may  not  be  able  to 
make  a  truly  autonomous  decision  if  he  or  she  is  exposed  to  a  potentially 
coercive  or  compromising  situation. 

Despite  the  obvious  importance  of  this  central  element  of  informed 
consent, virtually no studies have examined potential methods for
decreasing coercion in research. McCrady and Bux (1999) surveyed a sample
of researchers funded by the National Institutes of Health who were
currently recruiting participants from settings considered to be implicitly
coercive (e.g., inpatient units, detoxification facilities, prisons). The
researchers were surveyed about the types of procedures they used to ensure
that participants were free from coercion. Among the most commonly
reported protections were (1) discussing with participants the possibility of
feeling coerced, (2) obtaining consent from the individuals responsible for
the participants, (3) changing the compensation to prevent the coercive
effects of monetary incentives, (4) making clear that treatment is not
influenced by participation in research, (5) reminding participants that
participation is voluntary, (6) having participants delay consent to think about
participation, and (7) providing a clear list of treatment options as an
alternative to research.

Developing  a  Consent  Form 

Given  the  importance  of  informed  consent  and  the  many  problems  re¬ 
garding  its  comprehension  and  retention,  researchers  should  be  careful  to 
provide  consent  information  to  potential  research  participants  or  their 
representatives  in  language  that  is  understandable  and  clear.  Typically,  in¬ 
formed  consent  must  be  documented  by  the  use  of  a  written  consent  form 
approved  by  the  IRB  and  signed  by  the  participant  or  the  participant’s 
legally  authorized  representative,  as  well  as  a  witness.  One  copy  should 
then  be  given  to  the  individual  signing  the  form  and  another  copy  should 
be  kept  by  the  researcher.  The  basic  elements  of  a  consent  form  include 
each  of  the  following: 

1.  An  explanation  of  the  purpose  of  the  study,  the  number  of  par¬ 
ticipants  that  will  be  recruited,  the  reason  that  they  were  se¬ 
lected,  the  amount  of  time  that  they  will  be  involved,  their  re¬ 
sponsibilities,  and  all  experimental  procedures. 

2.  A  description  of  any  potential  risks  to  the  participant. 

3.  A  description  of  any  potential  benefits  to  the  participant  or  to 
others  that  may  reasonably  be  expected  from  the  research. 

4.  A  description  of  alternative  procedures  or  interventions,  if  any, 
that  are  available  and  that  may  be  advantageous  to  the  participant. 

5.  A  statement  describing  the  extent,  if  any,  to  which  confiden¬ 
tiality  of  records  identifying  the  participant  will  be  maintained. 

6.  For  research  involving  more  than  minimal  risk,  an  explanation  as 
to  whether  any  compensation  will  be  provided  and  whether  any 
medical  treatments  are  available  if  injury  occurs  and,  if  so,  what 
they  consist  of,  or  where  further  information  may  be  obtained. 

7. Information about who can be contacted in the event that
participants require additional information about their rights or
specific  study  procedures,  or  in  the  event  of  a  research-related 
injury  or  adverse  event.  The  document  should  provide  the 
names  and  contact  information  for  specific  individuals  who 
should  be  contacted  for  each  of  these  concerns.  Many  IRBs  re¬ 
quire  that  a  consent  form  include  a  contact  person  not  directly 
affiliated  with  the  research  project,  for  questions  or  concerns 
related  to  research  rights  and  potential  harm  or  injury. 

8.  A  clear  statement  explaining  that  participation  is  completely 
voluntary  and  that  refusal  to  participate  will  involve  no  penalty 
or  loss  of  benefits  to  which  the  participant  is  otherwise  enti¬ 
tled. 

9.  A  description  of  circumstances  under  which  the  study  may  be 
terminated  (e.g.,  loss  of  funding). 

10. A statement that any new findings discovered during the
course of the research that may relate to the participant’s
willingness to continue participation will be provided to the
participant.

Under federal regulations contained in 45 CFR § 46.116(d), an IRB may
approve a waiver or alteration of informed consent requirements whenever
it finds and documents all of the following:

1. The research involves no more than minimal risk to participants.

2. The waiver or alteration will not adversely affect the rights and welfare
of participants.

3. The research could not practicably be carried out without the waiver
or alteration.

4. Where appropriate, the participants will be provided with
additional pertinent information after participation.
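
Because the IRB must find and document all of the conditions listed above, the waiver decision can be read as a simple conjunction. The following Python sketch is only an illustration of that logic; the class and field names are hypothetical and do not come from the regulations, and it assumes that the fourth condition is satisfied automatically when providing follow-up information is not appropriate for the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaiverFindings:
    """Findings an IRB documents for one waiver request (hypothetical names)."""
    minimal_risk_only: bool              # condition 1
    rights_and_welfare_unaffected: bool  # condition 2
    impracticable_without_waiver: bool   # condition 3
    followup_information_planned: bool   # condition 4
    # Assumption: condition 4 binds only "where appropriate".
    followup_information_appropriate: bool = True


def waiver_may_be_approved(findings: WaiverFindings) -> bool:
    """Return True only if every listed condition is met."""
    followup_ok = (
        findings.followup_information_planned
        or not findings.followup_information_appropriate
    )
    return (
        findings.minimal_risk_only
        and findings.rights_and_welfare_unaffected
        and findings.impracticable_without_waiver
        and followup_ok
    )
```

A request that fails any single condition, for example a study involving more than minimal risk, would not qualify under this sketch regardless of how the other conditions are met.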

The IRB may also approve a waiver of the requirement for written
documentation of informed consent under limited circumstances described at
45 CFR § 46.117(c).

INSTITUTIONAL REVIEW BOARDS

All  research  with  human  participants  in  the  United  States  is  regulated  by 
institutional  review  boards  (IRBs).  As  mentioned  earlier,  before  any  re¬ 
search  study  can  be  conducted,  the  researcher  must  have  the  procedures 
approved  by  an  IRB. 

IRBs  are  formed  by  academic,  research,  and  other  institutions  to  pro¬ 
tect  the  rights  of  research  participants  who  are  participating  in  studies  be¬ 
ing  conducted  under  the  jurisdiction  of  the  IRBs.  IRBs  have  the  authority 
to  approve,  require  modifications  of,  or  disapprove  all  research  activities 
that  fall  within  their  jurisdiction  as  specified  by  both  the  federal  regula¬ 
tions  and  local  institutional  policy.  Researchers  are  responsible  for  com¬ 
plying  with  all  IRB  decisions,  conditions,  and  requirements. 

Researchers  planning  to  conduct  research  studies  must  begin  by 
preparing  written  research  protocols  that  provide  complete  descriptions 
of  the  proposed  research  (see  Rapid  Reference  8.6).  The  protocol  should 
include  detailed  plans  for  the  protection  of  the  rights  and  welfare  of 
prospective  research  participants  and  make  certain  that  all  relevant  laws 
and  regulations  are  observed.  Once  the  written  protocol  is  completed,  it  is 
sent  to  the  appropriate  IRB  along  with  a  copy  of  the  consent  form  and  any 
additional materials (e.g., test materials, questionnaires). The IRB will then
review  the  protocol  and  related  materials. 

According to 45 CFR § 46.107, IRBs must have at least five members,
including the IRB chairperson, although most have far more. IRBs should be
made up of individuals of varying disciplines and backgrounds. This
heterogeneity is necessary to ensure that research protocols are reviewed from
many different perspectives. This includes having researchers, laypeople,
individuals from different disciplines, and so on. For example, an IRB may
include scientists and/or methodologists who are familiar with research
and statistical issues; social workers who are familiar with social, familial,
and support issues; physicians and psychologists who are familiar with
physical and emotional concerns; lawyers who can address legal issues; and
clergy who can address spiritual and community issues. And when
protocols involve vulnerable populations, such as children, prisoners, pregnant
women, or handicapped or mentally disabled persons, the IRB must
consider the inclusion of one or more individuals who are knowledgeable
about and experienced in working with these potential participants.

Rapid Reference 8.6

IRB Review: Protocol Submission Overview

1. Introduction and rationale for study.

2. Specific aim(s).

3. Outcomes to be measured.

4. Number of participants to be enrolled per year and in total.

5. Considerations of statistical power in relation to enrollment.

6. Study procedures.

7. Identification of the sources of research material obtained from
individually identifiable living human participants in the form of
specimens, records, or data.

8. Sample characteristics (i.e., anticipated number, ages, gender, ethnic
background, and health status). Inclusion and exclusion criteria.
Rationale for use of vulnerable populations (i.e., prisoners, pregnant women,
disabled persons, drug users, children) as research participants.

9. Recruitment procedures, nature of information to be provided to
prospective participants, and the methods of documenting consent.

10. Potential risks and benefits of participation. (Are the risks to
participants reasonable in relation to the anticipated benefits to participants
and in relation to the importance of the knowledge that may
reasonably be expected to result from the research?)

11. Procedures for protecting against or minimizing potential risks. Plans
for data safety monitoring and addressing adverse events if they
occur. Alternative interventions and procedures that might be
advantageous to the participants.

12. Inclusion of or rationale for excluding children (rationale to be based
on specific regulations outlined in 45 CFR § 46).

In  addition  to  their  diversity  and  professional  competence,  IRBs  must 
have  a  clear  understanding  of  federal  and  institutional  regulations  so  that 
they can determine whether the proposed research is in line with
institutional regulations, applicable law, and standards of professional conduct
and  practice.  Importantly,  IRBs  are  required  to  have  at  least  one  member 
who  has  no  affiliation  with  the  institution  (even  through  an  immediate 
family  member).  Finally,  the  IRB  must  make  every  effort  to  ensure  that  it 
does  not  consist  entirely  of  men  or  entirely  of  women,  although  selections 
cannot  be  made  on  the  basis  of  gender. 
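
The membership requirements described above lend themselves to a simple checklist. The sketch below, written in Python, shows one way an administrator might flag composition concerns on a roster. The `Member` fields and the wording of the messages are hypothetical, and the check covers only the three countable requirements mentioned in the text (size, an unaffiliated member, and gender mix); it does not attempt to judge disciplinary diversity or professional competence.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Member:
    """One IRB member (hypothetical record layout)."""
    name: str
    gender: str
    affiliated: bool  # True if the member, or an immediate family member, is affiliated


def composition_concerns(members: List[Member]) -> List[str]:
    """Return a list of concerns; an empty list means none were found."""
    concerns = []
    if len(members) < 5:
        concerns.append("fewer than five members")
    if all(m.affiliated for m in members):
        concerns.append("no member unaffiliated with the institution")
    if members and len({m.gender for m in members}) == 1:
        concerns.append("board consists entirely of one gender")
    return concerns
```

An empty result indicates only that these three countable requirements are met; the qualitative requirements still call for human judgment.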

One  of  the  initial  questions  an  IRB  must  ask  when  reviewing  a  research 
protocol  is  whether  that  IRB  has  jurisdiction  over  the  research.  That  is,  the 
IRB  must  ask,  “Is  the  research  subject  to  IRB  review?”  To  answer  this 
question,  the  IRB  must  determine  (1)  whether  the  activity  involves  research 
and (2) whether it involves human participants. Research is defined by the
federal regulations as “a systematic investigation, including research
development, testing and evaluation, designed to develop or contribute to
generalizable knowledge” (45 CFR § 46.102[d]). Human participants are defined
by the regulations as “living individual(s) about whom an investigator
(whether professional or student) conducting research obtains (1) data
through intervention or interaction with the individual, or (2) identifiable
private information” (45 CFR § 46.102[f]).

Some  types  of  research  involving  human  participants  may  be  exempt 
from IRB review (45 CFR § 46.101[b]). These include certain types of
educational testing and surveys for which no identifying information is
collected or recorded. In such instances, the participants would not be at risk
of  any  breach  of  confidentiality. 

If  the  study  is  not  deemed  to  be  exempt  from  IRB  review,  the  IRB  must 
determine  whether  the  protocol  needs  to  undergo  expedited  review  or  full 
review.  To  meet  the  requirements  for  expedited  review,  a  study  must  involve 
no more than minimal risk, or otherwise fall into one of several specific
categories, such as survey research or research on nonsensitive topics. Minimal
risk is defined by federal regulations as meaning that the “probability and
magnitude of harm or discomfort anticipated in the research are not greater
in and of themselves than those ordinarily encountered in daily life or
during the performance of routine physical or psychological examinations or
tests” (45 CFR § 46.110[b]). Expedited review can also be obtained for
minor changes in previously approved research protocols during the period
(of one year or less) for which the original protocol was authorized.
Expedited reviews can be handled by a single IRB member (often the chair) and
therefore are much more expeditious (as the name suggests).
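
The sequence of questions an IRB works through (jurisdiction, exemption, then expedited versus full review) can be summarized as a short decision procedure. The Python sketch below is an illustration only: the function and parameter names are hypothetical, and each yes/no input stands for a judgment the IRB itself must make under the regulations, including whether a study meets the expedited criteria.

```python
from enum import Enum


class ReviewLevel(Enum):
    NOT_SUBJECT_TO_REVIEW = "not subject to IRB review"
    EXEMPT = "exempt from IRB review"
    EXPEDITED = "expedited review"
    FULL = "full review"


def review_level(
    is_research: bool,
    involves_human_participants: bool,
    qualifies_for_exemption: bool,
    meets_expedited_criteria: bool,
) -> ReviewLevel:
    """Walk the questions in the order the chapter describes them."""
    # Jurisdiction: the activity must be research AND involve human participants.
    if not (is_research and involves_human_participants):
        return ReviewLevel.NOT_SUBJECT_TO_REVIEW
    if qualifies_for_exemption:
        return ReviewLevel.EXEMPT
    if meets_expedited_criteria:
        return ReviewLevel.EXPEDITED
    # Everything else goes to the full board.
    return ReviewLevel.FULL
```

The ordering matters in this sketch: exemption is considered before expedited review, and full review is the default for any protocol that does not qualify for either.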

Protocols  that  do  not  meet  the  criteria  for  expedited  review  must  re¬ 
ceive  a  full  review  by  all  members  of  the  IRB.  Under full  review,  all  members 
of  the  IRB  receive  and  review  the  protocol,  consent,  and  any  additional 
materials  prior  to  their  scheduled  meeting.  Depending  on  the  particular 
IRB  and  the  number  of  protocols  that  they  normally  review,  an  IRB  may 
meet  anywhere  from  biweekly  to  quarterly.  Following  a  thorough  review 
and  discussion  of  issues  and  concerns  within  the  committee,  many  IRBs 
invite  the  researchers  in  to  answer  specific  questions  from  the  IRB  mem¬ 
bers.  Questions  may  address  any  or  all  aspects  of  the  research  procedures. 
After  all  of  the  IRB’s  questions  have  been  answered  and  the  researchers 
leave  the  room,  the  committee  votes  to  either  grant  approval  or  not.  In 
most  cases,  the  committee  will  vote  to  withhold  approval  pending  certain 
modifications  or  changes  to  the  protocol  or  the  consent  procedures.  Once 
the  modifications  are  made,  the  protocol  must  be  resubmitted.  If  the  IRB 
is  satisfied  that  the  necessary  modifications  were  made,  they  will  typically 
grant  approval  and  provide  the  researcher  with  a  copy  of  the  study  con¬ 
sent  form  bearing  the  IRB’s  stamped,  dated  approval.  Only  copies  of  this 
stamped  consent  form  may  be  used  to  obtain  informed  consent  from 
study  participants.  Although  IRB  approval  can  be  granted  for  one  full 
year,  certain  studies  (often  those  involving  a  less  clear  risk/benefit  ratio) 
may  receive  approval  for  6  months  or  less.  In  any  case,  researchers  must 
make  certain  to  keep  approvals  and  consent  forms  current.  If  the  study  is 
approved,  the  researcher  is  then  responsible  for  reporting  the  progress  of 
the  research  to  the  IRB  and/or  appropriate  institutional  officials  as  often 
as  (and  in  the  manner)  prescribed  by  the  IRB,  but  no  less  than  once  per 
year  (45  CFR  §  46.109[e]). 

DATA  SAFETY  MONITORING 

Concerns  about  respect,  beneficence,  and  justice  are  not  entirely  put  to 
rest by institutional review and informed consent. Although these
processes ensure the appropriateness of the research protocol and allow po¬
tential  participants  to  make  autonomous  informed  decisions,  they  do  not 
provide  for  ongoing  oversight  that  may  be  necessary  to  maintain  the  safety 
and  ethical  protections  of  participants  as  they  proceed  through  the  re¬ 
search  experience.  To  accomplish  this  may  require  the  development  of  a 
data  safety  monitoring  plan  (DSMP). 

DSMPs  set  specific  guidelines  for  the  regular  monitoring  of  study  pro¬ 
cedures,  data  integrity,  and  adverse  events  or  reactions  to  certain  study 
procedures. According to federal regulations (45 CFR § 46.111[a][6]),
“[W]hen appropriate, the research plan makes adequate provision for
monitoring the data collected to ensure the safety of subjects.” The NIH,
along with other public and private agencies, has developed specific
criteria for its DSMPs. For example, for Phase I and Phase II NIH clinical
trials  (NIH,  1998),  researchers  are  required  to  provide  a  DSMP  as  part  of 
their  grant  applications.  DSMPs  are  then  reviewed  by  the  scientific  review 
groups,  who  provide  the  researchers  with  feedback.  Subsequently,  re¬ 
searchers  are  required  to  submit  more  detailed  monitoring  plans  as  part  of 
their  protocols  when  they  apply  for  IRB  approval. 

In  addition  to  the  DSMP,  researchers  may  be  required  by  their  funding 
agencies  or  IRBs  to  establish  a  data  safety  monitoring  board  (DSMB).  The 
DSMB  serves  as  an  external  oversight  committee  charged  with  protecting 
the  safety  of  participants  and  ensuring  the  integrity  of  the  study.  The 
DSMBs,  which  must  be  very  familiar  with  the  research  protocols,  are 
responsible  for  periodically  reviewing  outcome  data  to  determine  whether 
participants  in  one  condition  or  another  are  facing  undue  harm  as  a  result 
of  certain  experimental  interventions.  The  DSMBs  may  also  monitor 
study  procedures  such  as  enrollment,  completion  of  forms,  record  keep¬ 
ing,  data  integrity,  and  the  researchers’  adherence  to  the  study  protocol. 
Based on these data, the DSMB can make specific recommendations
regarding appropriate modifications. In trials that are conducted across
several programs or agencies (i.e., multicenter trials), DSMBs may act as
overarching IRBs that are responsible for the ethical oversight of the entire
project.

ADVERSE AND SERIOUS ADVERSE EVENTS

Researchers  are  required  to  report  (to  the  governing  IRBs)  any  untoward 
or  adverse  events  involving  research  participants  during  the  course  of 
their  research  involvement.  Although  the  specific  reporting  requirements 
differ  by  IRB  and  funding  source,  the  definitions  of  adverse  events  (origi¬ 
nating  in  the  FDA’s  definitions  of  adverse  events  in  medical  trials)  are  gen¬ 
erally  the  same. 

An  adverse  event  (AE)  is  defined  as  any  untoward  medical  problem  that 
occurs  during  a  treatment  or  intervention,  whether  it  is  deemed  to  be  re¬ 
lated  to  the  intervention  or  not.  A  serious  adverse  event  (SAE)  is  defined  as 
any  occurrence  that  results  in  death;  is  life-threatening;  requires  inpatient 
hospitalization or prolongation of existing hospitalization; or results in
persistent or significant disability/incapacity or a congenital anomaly/birth
defect.
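
To make the distinction concrete, the sketch below encodes the SAE criteria as a simple check: an event counts as serious if it meets at least one of the listed criteria. The `AdverseEvent` record, its field names, and the `is_serious` function are hypothetical and introduced only for illustration; actual reporting forms and requirements are set by the governing IRB and funding source.

```python
from dataclasses import dataclass


@dataclass
class AdverseEvent:
    """One untoward medical problem reported for a participant (hypothetical record)."""
    description: str
    resulted_in_death: bool = False
    life_threatening: bool = False
    hospitalization: bool = False        # inpatient admission or a prolonged existing stay
    persistent_disability: bool = False  # persistent or significant disability/incapacity
    congenital_anomaly: bool = False     # congenital anomaly or birth defect


def is_serious(event: AdverseEvent) -> bool:
    """Return True if the event meets at least one of the SAE criteria."""
    return any([
        event.resulted_in_death,
        event.life_threatening,
        event.hospitalization,
        event.persistent_disability,
        event.congenital_anomaly,
    ])


# Illustrative events only; these are not drawn from any real study.
headache = AdverseEvent("mild headache after session")
admission = AdverseEvent("overnight hospital admission", hospitalization=True)
print(is_serious(headache))   # False: an AE, but no SAE criterion is met
print(is_serious(admission))  # True: inpatient hospitalization is an SAE criterion
```

Note that, consistent with the definitions above, the check does not ask whether the event was caused by the intervention; relatedness is a separate judgment.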


SUMMARY 

This  chapter  was  intended  to  provide  a  general  history  and  overview  of 
some of the central ethical issues relating to the conduct of scientific
research. Unfortunately, comprehensive coverage of many specific topics in
research ethics (e.g., publication credit, reporting research results,
plagiarism) was beyond the scope of this chapter. Therefore, we strongly recommend that
readers  refer  to  specific  ethical  codes  and  federal,  local,  and  institutional 
regulations  when  planning  and  engaging  in  research. 

The  many  revelations  of  human  rights  violations  and  atrocities  in  the 
name  of  scientific  research  have  led  to  a  heightened  public  awareness 
about  the  need  for  regulations  to  protect  the  rights  of  human  research 
participants.  In  response  to  this  heightened  awareness  and  call  for  protec¬ 
tions,  the  federal  government  has  established  an  extensive  system  of  reg¬ 
ulations  and  guiding  principles  to  promote  respect  for  persons,  benefi¬ 
cence,  and  justice  in  research  with  human  participants.  These  regulations 
have  helped  to  delineate  the  specific  types  of  information  that  must  be 
conveyed to potential research participants in an effort to ensure that
consent to research is voluntary, knowing, and intelligent. In addition, these
regulations  have  generated  mandatory  ethical  oversight  of  research  stud¬ 
ies.  Despite  these  many  developments,  there  is  still  a  need  for  further  re¬ 
search  in  the  area  of  ethical  protections  in  research  studies.  If  anything  has 
been  learned  in  the  years  since  Nuremberg  and  Tuskegee,  it  is  that  we  must 
continue  to  be  vigilant  in  protecting  the  rights  and  interests  of  our  human 
research  participants. 


TEST YOURSELF

1. The three principles set forth by the Belmont Report are (1) respect for
persons, (2) beneficence, and (3) _____.

2. Beneficence has its origins in the famous edict of the Hippocratic oath,
which states, “First, do no _____.”

3. In most cases, before an individual can participate in any research study,
he or she must provide _____.

4. Before any study can take place, it must first be approved by an _____.

5. The three basic elements of informed consent are that it must be (1)
competent, (2) knowing, and (3) _____.

Answers: 1. justice; 2. harm; 3. informed consent; 4. institutional review board (or human
subjects committee); 5. voluntary

DISSEMINATING RESEARCH RESULTS
AND DISTILLING PRINCIPLES OF
RESEARCH DESIGN AND METHODOLOGY

At  this  point  in  the  book,  you  should  have  a  fairly  good  conceptu¬ 
alization  of  the  major  considerations  that  are  involved  in  con¬ 
ducting  a  research  study.  In  the  preceding  chapters,  we  have  cov¬ 
ered  each  step  in  the  process  of  conducting  research,  from  the  earliest 
stages — choosing  a  research  idea,  articulating  hypotheses,  and  selecting  an 
appropriate  research  design — to  the  final  stages — analyzing  the  data  and 
drawing  valid  conclusions.  Along  the  way,  we  have  also  discussed  several 
important  research-related  considerations,  including  several  types  of  va¬ 
lidity,  methods  of  controlling  artifact  and  bias,  and  the  ethical  issues  in¬ 
volved  in  conducting  research.  Although  you  may  not  feel  like  an  expert  in 
research  yet,  you  should  take  comfort  in  knowing  that  the  concepts  and 
strategies  that  you  learned  from  this  book  will  provide  you  with  a  solid 
foundation  of  research-related  knowledge.  As  you  gain  additional  re¬ 
search  experience,  these  concepts  and  strategies  will  become  second  na¬ 
ture.  We  have  certainly  covered  a  good  deal  of  information  in  this  book, 
but  we  are  not  quite  finished  yet. 

In  this  concluding  chapter,  we  will  discuss  what  is  often  considered  the 
final  step  of  conducting  a  research  study:  disseminating  the  results  of  the 
research.  As  will  be  discussed,  there  are  numerous  options  available  for 
those  researchers  who  desire  to  share  the  results  of  their  studies  with  oth¬ 
ers.  From  books  to  journals  to  the  Internet,  today’s  society  offers  many  ef¬ 
fective  and  efficient  outlets  for  the  dissemination  of  research  study  results. 
After  discussing  the  dissemination  of  research  results,  the  final  part  of  this 
chapter  will  present  a  distillation  of  the  major  principles  of  research  design 
and methodology. Finally, to assist the reader in developing a sound
research design, this chapter will include a checklist of the major
research-related concepts and considerations we have covered in this book.

DISSEMINATING  THE  RESULTS  OF  RESEARCH  STUDIES 

This  book  would  certainly  be  incomplete  if  we  did  not  discuss  the  dissem¬ 
ination  of  research  results.  This  is  an  important  topic  that  is  occasionally 
overlooked  in  research  design  and  methodology  textbooks.  As  we  will  see, 
the dissemination of research study results plays a vital role in the
advancement of science and, consequently, in the way we all live.

If  you  recall,  at  the  beginning  of  this  book,  we  discussed  the  role  that  re¬ 
search  plays  in  science.  Specifically,  we  stated  that  research  is  the  primary 
vehicle  by  which  science  advances.  Among  other  things,  research  has  the 
capacity  to  answer  questions,  solve  problems,  and  describe  things,  all  of 
which  may  lead  to  an  improvement  in  the  way  we  live.  But  here  is  the 
essential  point  to  remember:  For  a  research  study  to  change  the  way  we 
live,  or  to  have  any  effect  at  all,  the  researcher  must  share  the  results 
of  the  research  with  other  people  in  the  scientific  community.  Then,  in 
turn,  the  information  gleaned  from  the  research  study — regardless  of 
whether  it  relates  to  technology,  medicine,  economics,  or  any  other  field 
of  study — must  ultimately  be  shared  with  the  general  public  in  one  form 
or  another. 

We  would  all  likely  agree  that  it  would  certainly  do  little  good  if  a  re¬ 
searcher  who  discovered  something  important  decided  to  keep  those  re¬ 
sults  quiet.  Can  you  imagine  how  different  the  world  would  be  if  Thomas 
Edison  had  invented  the  light  bulb,  but  then  decided  not  to  tell  anyone 
about  his  invention?  What  if  Albert  Einstein  had  decided  not  to  share  his 
special  and  general  theories  of  relativity?  What  if  Bill  Gates  had  decided  to 
keep his computer technology all to himself? What if Jonas Salk had decided
that his polio vaccine should not be shared with other people? Clearly,
then,  sharing  the  results  of  research  studies  is  important,  but  let’s  take  a 
closer  look  at  why  it  is  so  important.  After  discussing  the  importance  of 
sharing the results of research studies, we will briefly discuss the various
outlets that are available to researchers who decide to share their results.


Sharing  the  Results  of  Research  Studies 

There  are  several  benefits  to  sharing  the  results  of  research  studies.  First, 
it  adds  to  the  knowledge  base  in  a  particular  scientific  field.  As  you  know, 
science  is  essentially  an  accumulation  of  knowledge,  and  sharing  research 
results  adds  an  incremental  amount  of  knowledge  to  what  is  already 
known  about  a  particular  topic.  Thus,  the  dissemination  of  research  results 
helps  to  advance  the  progress  of  science. 

Second,  sharing  the  results  of  research  ultimately  improves  the  overall 
quality  of  the  research  being  conducted.  For  example,  when  a  researcher 
seeks  to  publish  the  results  of  his  or  her  research  in  a  professional  journal, 
the  manuscript  describing  the  research  is  typically  reviewed  by  several  ed¬ 
itors  who  have  special  expertise  in  the  topic  area  of  the  research.  As  we  will 
discuss  in  the  next  section,  the  editors  evaluate  the  quality  of  the  study  and 
the  manuscript,  and  then  they  make  a  recommendation  regarding  whether 
the  manuscript  should  be  published  in  the  journal.  This  is  referred  to  as 
the peer-review process. Presumably, only the most well-conducted studies
and well-written manuscripts will make it through this peer-review process
to publication. As a result, the publication process tends to weed out poorly
conducted studies, which has the effect of improving the quality of the
research being conducted. In summary, if researchers have an eye toward
eventually publishing the results of their studies, those researchers will
need to ensure that their studies are well designed and well conducted. We
will talk more about the publication process in the next section.


DON'T FORGET

Benefits of Sharing Research Results

1. Adds to the knowledge base in a particular scientific field.

2. Improves the overall quality of research being conducted.

3. Allows other researchers to replicate a study's results or extend the
study's findings.

4. Improves the way we live.

Third,  sharing  the  results  of  research  allows  other  researchers  to  evalu¬ 
ate  the  study’s  results  in  the  context  of  other  research  studies.  For  example, 
other  researchers  may  attempt  to  replicate  the  original  study’s  findings, 
which  we  already  established  is  an  important  component  of  scientific  re¬ 
search;  or  may  even  extend  the  original  study’s  findings  in  perhaps  unan¬ 
ticipated  ways.  In  either  case,  the  original  study’s  results  are  being  evalu¬ 
ated  by  other  researchers  in  other  contexts.  This  tends  to  function  as  a 
quality  check  on  the  original  research. 

Finally,  for  the  results  of  a  research  study  to  have  an  effect  on  the  way 
we  all  live,  those  results  need  to  be  shared  with  others.  This  is  the  point  we 
addressed  earlier  in  this  section.  To  refresh  your  memory,  we  established 
that  a  ground-breaking  research  study  would  do  little  good  if  the  re¬ 
searcher  decided  not  to  share  the  study’s  results  with  others.  In  fact,  some 
would  argue  that  the  true  test  of  a  research  study’s  value  lies  in  its  ability  to 
improve  some  facet  of  the  way  we  live.  For  that  improvement  to  take  place, 
a  study’s  results  need  to  be  shared  with  other  people.  For  example,  when 
Bill  Gates  developed  his  revolutionary  computer  technology,  that  tech¬ 
nology  had  to  be  shared  with  others  (e.g.,  scientists,  manufacturers,  dis¬ 
tributors,  marketing  firms),  and  then  that  technology  had  to  be  translated 
into  something  that  would  benefit  the  public  at  large — that  is,  personal 
computers  for  individual  sale. 

Now that we have addressed the importance of sharing the results of
research studies, let’s take a closer look at the various options that are avail¬
able  for  researchers  who  desire  to  disseminate  their  research  findings. 

Presentation  of  Research  Results 

One  option  available  to  those  researchers  who  decide  to  share  the  results 
of  their  research  is  to  present  their  findings  at  professional  conferences. 
Most  scientific  fields  have  guiding  professional  organizations  that  sponsor 
regularly  held  professional  conferences.  One  of  the  primary  functions  of 
these conferences is to serve as outlets for the presentation of research
results that are relevant to that particular scientific field. Because
professional conferences are held so frequently, they provide for the
dissemination of up-to-date research findings. By contrast, the lag time between
completing  a  research  study  and  the  eventual  publication  of  those  results 
in  a  professional  journal  is  typically  much  longer.  As  we  will  discuss  in  the 
next  section,  it  can  often  take  well  over  a  year  for  a  submitted  manuscript 
to  be  published  in  a  professional  journal.  By  that  time,  the  study’s  results 
may  have  been  expanded  upon,  refuted,  or  made  obsolete  by  other  stud¬ 
ies.  For  these  reasons,  professional  conferences  are  a  valuable  and  efficient 
outlet for research results.

Researchers  have  several  options  available  to  them  in  terms  of  present¬ 
ing  their  results  at  professional  conferences.  Although  the  format  for  pre¬ 
sentations  differs  from  conference  to  conference,  most  conferences  offer 
some  combination  of  the  following  presentation  formats:  poster  presen¬ 
tations,  oral  presentations,  and  symposiums.  A  poster  presentation,  as  the 
name  indicates,  involves  presenting  the  results  of  a  research  study  in  a 
poster  format.  At  many  conferences,  this  is  a  preferred  presentation  for¬ 
mat  for  students  and  beginning  researchers  (probably  because  there  are 
many  available  presentation  slots,  which  makes  it  less  competitive  than 
other presentation formats). An oral presentation involves speaking about
the  research  results  for  a  specified  amount  of  time  (sometimes  as  short  as 
10  minutes).  Finally,  a  symposium  is  a  collection  of  related  oral  presentations 
that  are  presented  as  a  group. 

Getting  to  present  the  results  of  a  research  study  at  a  professional  con¬ 
ference  is  a  competitive  process.  Typically,  researchers  submit  short  sum¬ 
maries  of  their  research  studies  to  the  conference  organizers  who,  in  turn, 
ask  reviewers  to  evaluate  the  research  and  determine  whether  the  study  is 
worthy of being presented at the conference. If the study is accepted, it must
then be determined whether it will be presented as a poster or an
oral  presentation.  At  most  conferences,  it  is  generally  considered  more 
prestigious  to  have  your  study  accepted  as  an  oral  presentation.  Often, 
short  summaries  of  the  research — abstracts — are  then  published  in  a  jour¬ 
nal  so  that  people  who  did  not  attend  the  conference  can  become  familiar 
with  the  results  of  the  studies. 

Publication of Research Results

Publication  of  research  results  is,  by  far,  the  most  common  method  of  dis¬ 
seminating  the  results  of  a  research  study.  There  are  several  publication 
options,  including  books,  book  chapters,  monographs,  newsletters,  work¬ 
ing  reports,  technical  reports,  and  Internet-based  articles.  However,  pub¬ 
lication  in  a  peer-reviewed  professional  journal  is  generally  considered  the 
primary  and  most  valued  outlet  for  the  dissemination  of  research  results 
(see  Kazdin,  1992, 2003b).  Let’s  take  a  closer  look  at  publishing  a  research 
study’s  results  in  a  peer-reviewed  journal. 

Earlier  in  this  chapter,  we  briefly  discussed  the  peer-review  process, 
which  is  the  procedure  used  by  most  professional  journals  to  determine 
which  articles  should  be  published.  In  this  section,  we  will  add  a  few  com¬ 
ments  to  our  previous  discussion.  Once  a  researcher  completes  a  study, 
there  are  several  decisions  that  need  to  be  made  (see  Kazdin,  1992).  The 
first  is  whether  the  study’s  findings  merit  publication.  In  other  words,  the 
researcher  must  determine,  among  other  things,  whether  the  study  makes 
a  valuable  contribution  to  the  field.  If  the  researcher  decides  to  seek  pub¬ 
lication  of  the  study’s  findings,  he  or  she  must  then  determine  what  aspects 
of  the  study  should  be  published.  In  large  studies,  it  may  not  be  practical 
to  publish  the  entire  study  in  one  manuscript,  so  it  may  need  to  be  sub¬ 
divided  in  some  rational  manner.  For  example,  if  a  research  study  has  two 
distinct parts, the researcher may decide to publish each part of the study
in a separate manuscript (but see Rapid Reference 9.1 for a word of caution
about doing this).


DON'T FORGET

Publishing a Study’s Results Begins in the Planning Stage

It is important to note that decisions made in the planning and design
stages of a research study have a direct effect on whether that study will
eventually be accepted for publication. Many of the decisions made in the
early stages of a study, such as what topic to study, what sample to use,
and which research design to implement, play an important role in
determining the overall quality and impact of the study, which are two
important considerations in whether it will later be published.


Rapid Reference 9.1

Least Publishable Units

Researchers must be careful to avoid breaking up a study into something
referred to as least publishable units. Although it is certainly desirable to
publish the results of a research study, most researchers agree that it is
not advisable to pad your curriculum vitae with more publications by
breaking up a study into the largest number of smallest publishable parts.
A study should be divided into separate manuscripts only if the division is
logically supported by the design of the study.

Having  decided  to  publish  the  study,  the  researcher  must  then  decide  to 
which  journal  he  or  she  will  submit  a  manuscript  describing  the  study. 
There  may  literally  be  hundreds  of  journals  in  a  given  scientific  field,  and 
the  researcher  must  carefully  determine  which  journal  would  be  the  most 
appropriate  outlet  for  his  or  her  research.  It  is  important  to  note  that,  in 
some  fields  of  study,  researchers  can  submit  a  manuscript  to  only  one  jour¬ 
nal  at  a  time.  In  these  situations,  the  researcher  must  await  a  final  publica¬ 
tion  decision  from  the  journal  before  submitting  the  manuscript  to  an¬ 
other  journal  (if  necessary).  Given  that  it  can  take  several  months,  or 
perhaps  even  longer,  for  a  manuscript  to  be  reviewed  and  for  a  publication 
decision  to  be  made,  researchers  must  decide  carefully  where  they  will 
send  their  manuscripts.  If  time  is  of  the  essence,  as  it  often  is  with  research, 
choosing  an  appropriate  journal  is  an  extremely  important  decision. 

Once  a  researcher  decides  on  a  particular  journal,  he  or  she  must  pre¬ 
pare  the  manuscript  in  accordance  with  the  style  and  formatting  require¬ 
ments  of  the  journal.  Different  journals — and  even  different  fields  of 
study — have  different  formatting  and  style  requirements,  and  it  is  very 
important  that  researchers  strictly  adhere  to  those  specifications.  For 
example,  in  psychology  (and  related  disciplines),  the  style  and  format  of 
manuscripts are specified by the APA (2001). The final manuscript consists
of several different sections (see Rapid Reference 9.2) that describe all
aspects of the research study, including the rationale for the study,
related research, study procedures, statistical analyses, results, and
implications.


Rapid Reference 9.2

Typical Sections of a Manuscript

For manuscripts that describe empirical studies, the following sections are
typically included:

1. Title

2. Abstract (brief summary of the study)

3. Introduction (rationale and objectives for the study; hypotheses)

4. Method (description of research design, study sample, and research
procedures)

5. Results (presentation of data, statistical analyses, and tests of
hypotheses)

6. Discussion (major findings, interpretations of data, conclusions,
limitations of study, and areas for future research)

After  the  manuscript  is  submitted  to  a  journal,  the  editor  of  the  journal 
sends  the  manuscript  to  several  reviewers  who  are  asked  to  review  the 
manuscript  and  make  a  publication  recommendation.  There  are  generally 
two  categories  of  reviewers  for  journals:  (1)  consulting  editors  (who  re¬ 
view  manuscripts  for  the  journal  on  a  regular  basis)  and  (2)  ad  hoc  editors 
(who  review  manuscripts  for  the  journal  less  frequently,  typically  on  an  as- 
needed  basis).  The  reviewers  are  usually  selected  because  of  their  knowl¬ 
edge  and  expertise  in  the  area  of  the  study  (Kazdin,  1992). 

The  reviewers  evaluate  each  research  study  in  terms  of  its  substance, 
methodology,  contribution  to  the  field,  and  other  considerations  relating 
to  the  overall  quality  of  the  research  study  and  the  accompanying  manu¬ 
script.  It  is  also  worth  noting  that,  depending  on  the  particular  field  of 
study,  the  editorial  reviews  may  be  either  anonymous  or  signed.  After  all 
of  the  reviewers  have  completed  their  reviews  and  submitted  their  written 
comments to the journal editor, the journal editor makes a final publication
decision based on his or her evaluation of the manuscript and the
reviewers’ written editorial comments.

Although  journals  differ  with  respect  to  how  they  handle  manuscript 
submissions,  most  journals  use  some  combination  of  the  following  publi¬ 
cation  decisions: 

1. Accepted: The manuscript is accepted contingent on the author’s
making revisions specified by the journal reviewers. Almost no
manuscript is accepted for publication as submitted (i.e., with
no revisions), and some accepted manuscripts may require several
rounds of revisions before finally being published.

2.  Rejected:  The  manuscript  is  rejected,  and  the  author  will  not  be 
invited  to  revise  and  resubmit  the  manuscript  for  further  publi¬ 
cation  consideration.  Manuscripts  can  be  rejected  for  many  dif¬ 
ferent  reasons,  including  design  flaws,  an  unimportant  topic, 
and  a  poorly  written  manuscript. 

3.  Rejected-resubmit:  The  manuscript  is  rejected,  but  the  author  is  in¬ 
vited  to  revise  and  resubmit  the  manuscript  for  future  publica¬ 
tion  consideration.  In  this  instance,  the  required  revisions  are 
typically  extensive,  and  there  is  no  guarantee  that  the  manuscript 
will  be  published,  even  if  all  of  the  specified  revisions  are  made. 

Most  researchers  would  likely  agree  that  going  through  the  peer-review 
publication  process  can  be  both  time  consuming  and  humbling.  Two  as¬ 
pects  of  this  process  can  be  particularly  difficult  to  handle  for  inexperi¬ 
enced  and  experienced  researchers  alike:  First,  the  peer-review  process  is 
often  excruciatingly  slow.  As  previously  noted,  once  a  manuscript  is  sub¬ 
mitted  to  a  journal,  it  can  take  several  months  for  a  publication  decision  to 
be  made.  If  extensive  revisions  are  required  as  a  condition  of  publication, 
then  it  can  take  significantly  longer  than  that.  Even  after  a  journal  decides 
to  publish  the  manuscript,  it  can  take  many  more  months — sometimes 
well  over  a  year — for  the  article  to  finally  be  published.  The  slow  pace  of 
the  peer-review  publication  process  is  often  a  source  of  frustration  for  re¬ 
searchers.  Moreover,  it  is  possible  for  research  results  to  become  stale,  or 
obsolete,  by  the  time  that  the  results  are  finally  published. 
Second, it is not easy to have your research evaluated, criticized, and
(more  often  than  not)  rejected  by  journals.  After  putting  a  great  deal  of 
thought,  energy,  time,  and  money  into  a  research  study,  it  can  be  difficult 
to  handle  criticism  and  rejection.  Yet  rejection — and  lots  of  it — is  part  of 
the business of conducting research. Some of the more prestigious
professional journals have rejection rates of over 90%, which means that they
accept for publication fewer than 1 manuscript out of every 10
that  are  submitted.  Even  seasoned  and  well-published  researchers  experi¬ 
ence  their  fair  share  of  rejection.  (At  this  point,  it  may  seem  that  we  should 
comfort  the  reader  by  indicating  that  the  rejection  aspect  of  publishing  be¬ 
comes  easier  over  time,  but  we’re  not  exactly  sure  that’s  true.)  Despite  the 
frustrations  associated  with  the  peer-review  process — in  fact,  perhaps  be¬ 
cause  of  the  frustrations  associated  with  the  peer-review  process — getting 
a  research  study  published  is  a  very  exciting  and  rewarding  accomplish¬ 
ment. 
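
The arithmetic behind rejection rates is worth a quick check. The sketch below, in which the function name `accepted_per_100` is hypothetical and chosen only for illustration, converts a rejection rate (in whole percent) into the number of manuscripts accepted per 100 submitted. A rejection rate above 90% therefore corresponds to fewer than 10 acceptances per 100 submissions.

```python
def accepted_per_100(rejection_percent: int) -> int:
    """Return how many of every 100 submitted manuscripts are accepted."""
    if not 0 <= rejection_percent <= 100:
        raise ValueError("rejection_percent must be between 0 and 100")
    return 100 - rejection_percent


# Illustrative rates only; these are not figures for any particular journal.
for rate in (90, 95):
    print(f"{rate}% rejected -> {accepted_per_100(rate)} accepted per 100 submitted")
```

Working in whole percentages keeps the example free of floating-point rounding; the same relationship holds for proportions (acceptance = 1 minus rejection).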

PRINCIPLES  OF  RESEARCH  DESIGN  AND  METHODOLOGY 

To  assist  you  in  digesting  the  large  amount  of  material  presented  in  this 
book,  we  have  distilled  some  overarching  principles  of  research  method¬ 
ology  that  should  be  kept  in  mind  when  engaging  in  research.  The  follow¬ 
ing  principles  should  serve  as  helpful  guides  as  you  engage  in  the  process 
of  designing  and  conducting  a  research  study. 

Keep Your Eyes Open

Perhaps  the  most  basic  lesson  to  guide  your  research  is  to  keep  your  eyes 
open.  As  we  discussed  in  Chapter  2,  many  ideas  for  research  studies  are 
discovered  simply  by  observation  of  the  environment  in  which  we  live.  It 
is  often  through  the  simple  act  of  observation  that  researchers  formulate 
their  research  ideas  and  choose  their  research  questions.  A  keen  eye  to 
your  surroundings  may  reveal  questions  that  need  to  be  answered,  prob¬ 
lems  that  need  to  be  solved,  things  that  need  to  be  improved,  or  phenom¬ 
ena  that  need  to  be  described,  all  of  which  can  be  accomplished  through 
well-designed and well-conducted research. Therefore, keeping your eyes
open  is  often  the  first  step  in  the  research  process. 

Be  An  Empiricist 

The  hallmark  of  being  a  good  researcher  is  being  an  empiricist.  As  you  may 
recall  from  Chapter  1,  empiricists  rely  on  the  scientific  method  to  acquire 
new  knowledge.  The  scientific  method’s  heavy  emphasis  on  direct  and 
systematic  observation  and  hypothesis  testing  in  the  acquisition  of  new 
knowledge  effectively  distinguishes  science  from  pseudo-science  and 
nonscience.  Moreover,  to  be  able  to  draw  valid  conclusions  based  on  your 
research,  which  is  the  goal  of  all  research,  it  is  essential  that  you  adhere  to 
the  empirical  approach. 

Be  Creative 

Throughout  this  book,  we  have  emphasized  the  importance  of  using  an 
appropriate  research  design  and  sound  methodology.  As  you  know,  en¬ 
gaging  in  well-designed  research  studies  is  the  only  way  of  ensuring  that  re¬ 
searchers  can  draw  valid  conclusions  based  on  the  results  of  their  studies. 
Clearly,  then,  basing  your  research  design  and  methodology  on  accepted 
scientific  principles  is  an  important  consideration. 

It  is  also  important,  however,  to  be  creative  when  conducting  research. 
Creativity  is  particularly  important  in  generating  new  research  ideas,  com¬ 
ing  up  with  appropriate  and  perhaps  novel  research  designs,  and  thinking 
about  the  implications  of  your  research  studies.  Thinking  outside  the  box 
has  led  to  many  great  scientific  discoveries.  Good  research  is  often  as  much 
art  as  it  is  science,  so  being  creative  is  an  important  asset  to  the  process. 

Research  Begets  Research 

This  principle  emphasizes  the  importance  of  following  a  logical  progres¬ 
sion  when  conducting  research.  In  other  words,  to  have  a  coherent  body  of 
research,  each  research  study  should  be  the  next  logical  step  in  the  overall 
line of research. As we have repeatedly noted throughout this book, science
advances  in  small  increments  through  well-conducted  research  studies. 
Therefore,  it  is  important  that  research  studies  answer  discrete  questions 
that  flow  logically  from  prior  research  studies.  Following  this  logical  pro¬ 
gression  of  research  ensures  that  research  studies,  and  the  findings  gleaned 
from  them,  are  based  on  a  solid  theoretical  and  empirical  foundation. 

Adhere  to  Ethical  Principles 

The  importance  of  adhering  to  applicable  ethical  principles  was  discussed 
in  detail  in  Chapter  8,  but  it  cannot  be  overemphasized.  The  rights  of  study 
participants  are  of  paramount  importance  in  the  research  context,  and 
protecting  those  rights  takes  precedence  over  all  other  research-related 
considerations.  Violating  applicable  ethical  guidelines  may  hurt  the  study 
participants,  the  reputation  of  the  researchers  who  conducted  the  study, 
and,  in  some  ways,  the  entire  field  of  scientific  research.  Thus,  researchers 
have  an  obligation  to  be  aware  of  the  ethical  guidelines  that  govern  the  re¬ 
search  that  they  are  conducting. 

Have  Fun 

This almost seems axiomatic, but we’ll state it anyway: try to have fun while
conducting research. Research can certainly be an arduous endeavor, but, as
with anything else, if you are having fun while you do it, you will be more
likely to become engaged in the process. Research can be exciting, so take
pride in being part of something that will advance science and potentially
improve the way we all live.

CHECKLIST  OF  RESEARCH-RELATED  CONCEPTS 
AND  CONSIDERATIONS 

We have finally reached the concluding section of this book. In this section,
we present a convenient checklist of the major research-related concepts and
considerations that we have covered. Although the following checklist could
not possibly contain every conceivable consideration that researchers must
take into account, it should serve to alert researchers to the major
considerations that must be kept in mind when designing and conducting a
research study.

1. Follow the scientific method. The scientific method is what separates
science from nonscience. The scientific method, with its
emphasis on observable results, assists researchers in reaching
valid and scientifically defensible conclusions.

2. Keep the goals of scientific research in mind. The goals of scientific
research are to describe, predict, and understand or explain.
Keeping these goals in mind will assist you in achieving the
broad goals of science, that is, answering questions and
acquiring new knowledge.

3. Choose a research topic carefully. There are two considerations with
respect to choosing a research topic. First, a research question
must be answerable using available scientific methods. If a
question cannot be answered, then it cannot be investigated
using science. Second, it is important to make sure that the
question you are asking has not already been definitively
answered; this emphasizes the importance of conducting a
thorough literature review.

4.  Use  operational  definitions.  Operational  definitions  clarify  exactly 
what  is  being  studied  in  the  context  of  a  particular  research 
study.  Among  other  things,  this  reduces  confusion  and  permits 
replication  of  the  results. 

5. Articulate hypotheses that are falsifiable and predictive. As you may
recall, each hypothesis must be capable of being refuted based
on the results of the study. Furthermore, a hypothesis must
make a prediction, which is subsequently tested empirically by
gathering and analyzing data.

6. Choose variables based on the research question and hypotheses. The
variables selected for a particular study should stem logically
from the research question and the hypotheses.

7. Use random selection whenever possible. Use random selection when
choosing a sample of research participants from the population
of interest. This helps to ensure that the sample is
representative of the population from which it was drawn.

8. Use random assignment whenever possible. Use random assignment
when assigning participants to groups within a study. Random
assignment is a reliable procedure for producing equivalent
groups because it evenly distributes characteristics of the
sample among all of the groups within the study. This helps the
researcher isolate the effects of the independent variable by
ensuring that nuisance variables do not interfere with the
interpretation of the study’s results.

9. Be aware of multicultural considerations. Be cognizant of the effects
that cultural differences may have on the research question and
design. For certain types of research, such as treatment-based
research, it is important to determine whether the intervention
being studied has similar effects on both genders and on
diverse racial and ethnic groups.

10. Eliminate sources of artifact and bias. To the extent possible,
eliminate sources of artifact and bias so that more confidence can
be placed in the results of the study. The effects of most types
of artifact and bias can be eliminated (or at least considerably
reduced) by employing random selection when choosing
research participants and random assignment when assigning
those participants to groups within the study.

11. Choose reliable and valid measurement strategies. When selecting
measurement strategies, let validity and reliability be your
guides. Measurement strategies should measure what they
purport to measure, and should do so in a consistent fashion.

12. Use rigorous experimental designs. Whenever possible, researchers
should use a true experimental design. Only a true experimental
design, one involving random assignment to experimental and
control groups, permits researchers to draw valid causal
inferences about the relationship between variables. Because it may
not always be possible or feasible to use a true experimental
design, a good rule of thumb is that researchers should strive to
use the most rigorous design possible in each situation.

13. Attempt to increase the validity of a study. A well-conducted research
study will have strong internal validity, external validity,
construct validity, and statistical validity. This maximizes the
likelihood of drawing valid inferences from the study.

14. Use care in analyzing and interpreting the data. A crucial aspect of
research studies is preparing the data for analysis, analyzing the
data, and interpreting the data. The proper analysis of a study’s
data enhances the ability of researchers to draw valid
inferences from the study.

15. Become familiar with commonly encountered ethical considerations.
Researchers have an obligation to avoid violating ethical
standards when conducting research. This means that researchers
must be familiar with, among other things, the rights of study
participants.

16.  Disseminate  the  results  of  research  studies.  Science  advances  through 
the  dissemination  of  research  findings,  so  researchers  should 
attempt  to  share  the  results  of  their  research  with  the  scientific 
community. 
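
Items 7 and 8 of the checklist describe two distinct uses of randomization,
and a short sketch may help keep them apart. The Python example below is
only an illustration under stated assumptions: the population of 200
participant IDs, the sample size of 40, the two group names, and the helper
names select_sample and assign_to_groups are all hypothetical and do not
come from this book. Random selection draws the sample from the population;
random assignment then distributes that sample across the groups.

```python
import random


def select_sample(population, n, seed=None):
    """Random selection: draw n distinct members of the population,
    with every member having the same chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def assign_to_groups(participants, group_names, seed=None):
    """Random assignment: shuffle the participants, then deal them
    round-robin into the groups so that group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, person in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(person)
    return groups


if __name__ == "__main__":
    # Hypothetical population of interest: 200 participant IDs.
    population = [f"P{i:03d}" for i in range(1, 201)]

    # Step 1 (item 7): randomly select a sample from the population.
    sample = select_sample(population, 40, seed=1)

    # Step 2 (item 8): randomly assign the sample to study groups.
    groups = assign_to_groups(sample, ["experimental", "control"], seed=2)

    for name, members in groups.items():
        print(name, len(members))
```

The seeds are fixed here only so that the illustration is reproducible. Note
that the two steps address different concerns: selection bears on how well
the sample represents the population, whereas assignment bears on whether
the groups are equivalent before the independent variable is introduced.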


SUMMARY 

We have covered quite a bit of research-related information in this book,
and we hope that you have learned a great deal about the process and
importance of conducting well-designed research studies. We are confident
that the material covered in this book will serve you well in your research
endeavors, and we believe that this book will provide you with a solid
foundation of research-related knowledge and skills. As you continue to
develop as a researcher, we hope that the lessons learned from this book
will remain in the forefront of your mind.

TEST YOURSELF


1. The final step in a research study is ________ the results of the study.

2. The ________-________ process is used by journals to determine which
manuscripts should be accepted for publication.

3. Presentations and publications are two options available to researchers
who desire to share the results of their studies. True or False?

4. What are the three possible editorial decisions following the peer review
of a manuscript?

5. A ________ is a collection of related oral presentations that are
presented as a group at a professional conference.

Answers: 1. disseminating (or sharing or publishing); 2. peer-review; 3. True; 4. Accepted,
rejected, rejected-resubmit; 5. symposium

consent form and, developing a, 252-253
protocol  submission  overview,  254,  255 
Instruments/instrumentation:
commercially  available,  108 
effects,  165 

measurement  strategies  for  data  collection 
and  appropriateness  of,  112—113 
multicultural  issues  and,  61 
new  or  unique,  114 

threats to internal validity and, 163-165, 175

Interaction  effects,  133-134 
external  validity,  190, 191 
Interest,  28-29 

construct  of,  measurement  strategies  and, 
115 

See also Population, of interest
Interference, multiple-treatment, 181-182, 189
Internal  validity,  66,  67, 158-160 
artifact  and  bias,  controlling,  81 
randomization  and,  81 
threats  to,  160-174, 175 
Interpretation.  See  Data,  interpretation 
Interrater  reliability,  105 
checks,  165 
Interval  scales,  99-100 

distinguishing  characteristics  of,  100 
Interviews, 117-118
Intrinsic  changes.  See  Maturation 
Intrinsic  vulnerabilities,  informed  consent  and, 
246 

Invariants,  150—151 

Inverse correlation. See Negative correlation
Inverse  relationship,  216 
Inverse  transformation,  207 

Justice,  238, 243-245 

Knowledge,  controlling  experimenter  bias  and, 
72, 74-75 

Last  value  carried  forward,  205 
Length,  test,  108 
Level,  change  in,  140—141 
Lexis,  32 

Likert  scales,  152—153 
Linear  regression,  224 
Literature  review,  32—34 
Logarithm,  207 
Logging,  199—201 
Logistic  regression,  224 
Log  transformation,  207 
Longitudinal  designs,  143 

MacArthur  Competence  Assessment  Tool  for 
Clinical  Research,  247 
Magnitude,  98—99 
Manual,  well-documented,  108 
Manuscript,  typical  sections  of,  267, 268 
Matching,  88-90 

block  randomization  and,  126 
Maturation,  threats  to  internal  validity  and, 
162-163,175 

Mean,  92, 212.  See  also  Group  means;  Predicted 
mean  imputation 


Measurement,  95-97 
error,  103—104 

strategies  for  minimizing,  104 
importance  of,  96 
obtrusive vs. unobtrusive, 186
psychometric  considerations,  101—111 
scales  of,  97-101 

strategies, commercially available instruments
and, 108

unreliability  of,  threats  to  statistical  validity 
and,  195, 196 

See  also  Associations,  measures  of 
Median,  213 
Medical  research,  237 
Medline,  32 

Mental  Measurements  Yearbook  and  Tests  in 
Print,  108, 113 
Methodology: 
defined,  22 

principles  of,  270—272 
Metric  data,  measurement  and,  97 
Minimal  risk,  256 

Moderator, focus groups and trained, 155
Mortality,  randomized  two-group  design  and, 
130 

Motivation,  choosing  a  research  topic  and,  29 
Multicultural  issues,  60-63 
competence  and,  60—61 
Multiple analysis of variance (MANOVA), 223
Multiple regression, 224
Multiple statistical comparisons. See Comparisons, multiple

Multiple  time-series  design.  See  Time-series 
design,  multiple 

National  Bioethics  Advisory  Commission 
(NBAC),  239-240 

National  Commission  for  the  Protection  of 
Human Subjects of Biomedical and Behavioral
Research, 237
National  Institutes  of  Health  (NIH),  62 

Guidelines on the Inclusion of Women and Minorities as
Subjects in Clinical Research, 62-63
Revitalization  Act  of  1993,  62 
National  Research  Act,  237 
National  Science  and  Technology  Council,  240 
Naturalistic observation studies. See Observations, naturalistic
Negative correlation, 19
Negative  relationship,  216 
Nominal  scales,  97-98 

distinguishing  characteristics  of,  97 
measurement  strategies  for  data  collection 
and,  112 

Nomothetic  approach,  17 
Nondirectional  hypotheses,  39—41 
Nonequivalent  comparison-group  designs,  138 
posttest  only,  138-139 
pretest-posttest,  139 

Nonexperimental designs. See Qualitative designs

Noninterference,  naturalistic  observation  and, 
150 

Nonmetric  data,  measurement  and,  97 

Nonsignificance,  231—232 

Normal  distribution.  See  Distribution,  normal 

Norms,  test  evaluation  and,  108 

Novelty  effects,  183-185, 189 

Nuisance  variables.  See  Variables,  nuisance 

Null  hypotheses,  9—10 

articulating  hypotheses  and,  38—39 
rejecting 

analyses  and,  11, 12 
conclusions  and,  14 
statistical  validity  and,  193 
See  also  Hypotheses 
Nuremberg Code, 235-237

Observations,  5,  6—7, 117, 119 
naturalistic,  149-151 

Obtrusive measurement. See Measurement,
obtrusive vs. unobtrusive

Office  for  Protection  From  Research  Risks,  63 
Operational  definitions,  7 

formulating  research  questions  and,  35—37 
measurement  and,  96 
psychometric  considerations,  101—102 
Oral  presentation,  265 
Ordinal  scales,  98-99 

distinguishing  characteristics  of,  98 
measurement  strategies  for  data  collection 
and,  112 
Outliers,  167 

Parametric  tests,  227 
Parsimony,  115 


Partial-blind technique, controlling experimenter
bias, 75, 76
Partial  correlation,  92,  93 
Participant  effects,  76-79 
controlling,  79—81 
Participants,  50—51,  256 
assigning  groups,  55—60 
multiculturalism  and,  62—63 
selecting  study,  51—55 
Pearson r, 218
Peer-review  process,  263 

Personal  characteristics,  informed  consent  and, 
246 

Phi  correlation,  219 

Plausible  hypotheses,  67.  See  also  Hypotheses 
Point-biserial  correlation,  218—219 
Population,  18 
of  interest,  82—83 
Positive correlation, 19
Positive  relationship,  216 
Poster  presentation,  265 
Practice  effect,  166-167 
Pre/post  design,  45 
Predicted  mean  imputation,  205 
Predictions,  8, 19, 20 

articulating  hypotheses  and,  37—38 
theories  and,  31 
Predictive validity, 109-110
Present,  42—43 
Previous  research,  30 
Problem,  formulating  the  research,  34—37 
Problem  solving,  29—30 
PsycINFO, 32, 33
PsycLIT, 32
Publication  bias,  231 
Publishing  results,  266-270 
least  publishable  units  and,  267 
P-value, 218
Pygmalion  effect,  69 

Qualitative  data,  measurement  and,  97 
Qualitative  designs,  147-156 
Qualitative research, 17
Qualitative  variables.  See  Variables,  qualitative 
Quality,  research  idea  and,  31—32 
Quality control procedures, controlling experimenter
bias and, 72, 73
Quantitative data, measurement and, 97
Quantitative research, 17

Quantitative  variables.  See  Variables,  quantitative 
Quasi-experimental  designs,  85, 137-147 
Questionnaire: 

closed-ended,  152, 153 
open-ended,  152, 153 
survey studies and, 152
Questions,  5,  7—8 

measurement  strategies  for  data  collection 
and,  111-112 

Random  assignment,  56—57 

artifact  and  bias,  controlling,  68,  82,  85—88 
See  also  Randomization 
Randomization: 

achieving  control  through,  81—93 
artifact  and  bias,  controlling,  68 
block,  125—126 
causality  and,  20 
checks,  129 

logistical  difficulty  of,  137 
See  also  Experimental  designs 
Random numbers table, 124-125
Random  selection,  54—55,  56 

artifact and bias, controlling, 68, 82-85, 86, 88

See  also  Randomization 
Range,  214 
Ratio  scales,  100—101 

distinguishing  characteristics  of,  100 
Reactivity: 

assessment  and,  185—186, 189 
experimental  arrangements  and,  180—181, 
189 

Reading  level,  test  evaluation  and,  108 
Record-keeping  responsibilities,  200 
Recruitment log, 199-200
Regression,  224—225 

Relational  vulnerabilities,  informed  consent 
and,  246—247 
Reliability: 

experiments  and,  10 
increasing,  strategies  for,  104 
instrumentation and threats to internal validity,
163-164

measurement  and,  102—106 

strategies  for  data  collection  and,  112 
test  evaluation  and,  108 
See  also  specific  type 


Replication,  5, 14—16 

operational  definitions  and,  36 
previous  research  and,  30 
Research,  definition  of,  46 
Researchers,  multiculturalism  and,  60—62 
Respect  for  persons,  238,  240—241 
Response  set,  206 
Results: 

misperceptions  and,  2 
presentation  of,  264—265 
publication  of,  266—270 
reporting,  261, 262—270 
popular  media  and,  2 
sharing  the,  263—264 
survey studies and, 153
Reversal  time-series  design.  See  Time-series 
design,  reversal 
Roles: 

multiple,  controlling  experimenter  bias,  72, 
73 

participant  effects  and,  78-79 
Rosenthal  effect,  69 

Sample, 18, 54

characteristics,  threats  to  external  validity 
and,  178-180, 189 

extraneous  variables,  controlling,  91—92 
survey  studies  and,  152 
Sample  of  convenience,  83—84 
Scientific  method,  4—16 
Screening.  See  Data,  screening 
Selection  biases,  threats  to  internal  validity  and, 
169-170,175 

Sensitization,  pretest  and  posttest,  186—187, 
189 

Serious  adverse  event  (SAE),  259 
Settings, threats to external validity and, 180,
189 

Significant difference, 11
Simple  interrupted  time-series  design.  See 
Time-series  design,  simple  interrupted 
Simple  regression,  224 

Situational  factors,  informed  consent  and,  246 
Slope,  change  in,  140—141 
Solomon  four-group  design,  132—133 
interaction effects and, 134
Spearman  rank-order  correlation,  219 
Split-half  reliability,  105 
Square root transformation, 207
Standard deviation, 92, 215-216
Standardization: 

experimenter  bias  and,  controlling,  72,  73 
instrumentation and threats to internal validity,
163-164

Standardized  administration  procedure,  test 
evaluation  and,  108 

Statistical  approaches,  controlling  extraneous 
variables,  92-93 
Statistical  conclusion  validity,  85 
Statistical  consultants,  controlling  experimenter 
bias  and,  72,  74 
Statistical  controls,  68 

Statistical  evaluation,  statistical  validity  and,  193 
Statistically  significant  difference,  45 
Statistically  significant  effect,  14 
Statistical  power,  137 

data  interpretation  and,  225 
low,  194, 196 

Statistical  regression,  threats  to  internal  validity 
and, 167-168, 175 
Statistical  significance,  218, 229 
Statistical  validity,  66,  67,  85, 158, 192-194 
threats  to,  194-196 

Stimulus characteristics, threats to external validity
and, 180, 189
Survey  studies,  151, 153-154 
nine steps for, 152-153
Symposium,  265 

Syphilis.  See  Tuskegee,  syphilis  study  at 

Tabulation,  153 
Temporal precedence, 144
Temporal  validity,  176 
Testing,  116, 117 

threats  to  internal  validity  and,  165-167, 175 
Test-retest  reliability,  105, 106 
Theoretical  soundness,  test  evaluation  and,  108 
Theory,  30-32 

Therapeutic  misconception,  249 
Time,  measurement  strategies  and,  115 
Time-order  relationship,  21 
Time-series  design,  139—140 
multiple,  143 
reversal,  142 

simple  interrupted,  141—142 
Timing: 

of assessment and measurement, 187-188, 189
test  evaluation  and,  108 


Topic,  choosing  a  research,  28—32 

Tracking,  199—201 

Training: 

controlling  experimenter  bias  and,  72—73 
measurement  strategies  for  data  collection 
and,  114—115 
Treatment: 

imitation  of,  171—173, 175 
medical  research  vs.  medical,  237 
special,  173-174, 175 
See  also  Interference,  multiple-treatment 
Trimodal  distribution.  See  Distribution,  trimodal 
True  experiments,  85 
True  score,  103 
T-test,  220-221 

controlling  extraneous  variables,  92,  93 
omnibus,  220 

Tuskegee,  syphilis  study  at,  235,  237 
Two-group  design,  89 
randomized,  127—128 
posttest  only,  128 
pretest-posttest,  128—132 
Type I errors, 11-14

data  transformation  and,  207 
statistical  power  and,  226 
Type  II  errors,  11—14 

data  transformation  and,  207 
statistical  power  and,  226 

U.S. Department of Health and Human Services, 63, 233, 239

U.S.  Food  and  Drug  Administration  (FDA), 
239,  245 

Unobtrusive measurement. See Measurement,
obtrusive  vs.  unobtrusive 

Validity,  23, 158, 196-197 
artifact  and  bias,  66 

experimental designs and threats to, 136-137

instrumentation and threats to internal validity,
163-164

measurement and, 106-111

strategies  for  data  collection  and,  112 
test  evaluation  and,  108 
See  also  specific  type 

Values, identifying and coding missing, 204, 205, 206

Variability, 194-195, 196
Variables:

categorical,  47-49 
choosing,  41—50 

computing  totals  and  new,  204,  205,  206, 
207 

continuous,  47—49 
defined,  3, 42 
dependent,  42—47 

measurement strategies, 48, 111-112
holding  constant,  88—92 
independent,  42—47 

factorial  design  and  multiple,  135 


measurement strategies, 111-112
varying,  48 
nuisance,  57 

equivalence  testing  and,  59 
quantitative,  49-50 

See  also  Database,  defining  variables  within  a 
Variance,  214-215 

Volunteers,  participant  effects  and,  78 

Waiver,  253 
Westlaw, 32

World Medical Association, 237